
czyzby

01/10/2018, 12:55 PM
@Sebastian Kwiatkowski A default initialization could be chosen automatically based on the activation function (Xavier for tanh, He for ReLU, etc.). https://arxiv.org/pdf/1704.08863.pdf Most frameworks use the same optimization method for the whole network and do not force you to pass it into each layer. Filter width and height could be merged into a single
filterSize
parameter (although without a nice tuple/array syntax, Python filter sizes will still look better). I'm a fan of adding the activation function to the layer initialization parameters, so basically I'd prefer this:
```python
Conv2D(100, kernel_size=(3, 3), activation='relu')
```
Over this:
```python
Conv2D(100, kernel_size=(3, 3)),
Activation('relu')
```
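The activation-based initialization default could be sketched in Kotlin with a simple `when` expression. All names below (`Activation`, `Initialization`, `defaultInitialization`) are hypothetical, not taken from any real framework:

```kotlin
// Hypothetical enums - sketches only, not a real framework API.
enum class Activation { RELU, TANH, SIGMOID }
enum class Initialization { HE, XAVIER }

// Pick a weight initialization from the activation function:
// He init for ReLU-like activations, Xavier/Glorot for saturating ones.
fun defaultInitialization(activation: Activation): Initialization = when (activation) {
    Activation.RELU -> Initialization.HE
    Activation.TANH, Activation.SIGMOID -> Initialization.XAVIER
}
```

A user could still pass an explicit initialization strategy; the `when` would only supply the default.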
When I looked through the examples, the first thing I noticed is that they could definitely use Kotlin's default parameter syntax. Don't be afraid to use some magic numbers - parametrizing everything through local variables actually makes the code a bit harder to read. Compare it to the Keras examples:
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D

model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(seq_length, 100)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)
```
Although you have all parameters in one place (more or less), I think it's acceptable to use plain numbers in toy examples that showcase the framework. Using Kotlin's named parameters makes them no less readable.
```kotlin
// Current:
convolution(numberFilters, filterWidth, filterHeight, initializationStrategy, optimizationStrategy)

// Rewritten pseudo-code:
convolution(filters = 2, width = 3, height = 3)
```
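To make the point concrete, here is how a fragment of the Keras example above might look in a hypothetical Kotlin API with named and default parameters. `Sequential`, `Conv1D`, and `Dense` are stub classes invented for this sketch, not a real framework:

```kotlin
// Hypothetical layer types - sketches only, not a real framework API.
data class Conv1D(
    val filters: Int,
    val kernelSize: Int,
    val activation: String = "linear", // default removes boilerplate
)

data class Dense(
    val units: Int,
    val activation: String = "linear",
)

class Sequential {
    val layers = mutableListOf<Any>()
    fun add(layer: Any) { layers.add(layer) }
}

fun main() {
    val model = Sequential()
    // Named parameters make the magic numbers self-describing:
    model.add(Conv1D(filters = 64, kernelSize = 3, activation = "relu"))
    model.add(Conv1D(filters = 64, kernelSize = 3)) // default activation kicks in
    model.add(Dense(units = 1, activation = "sigmoid"))
}
```

With defaults on `activation`, the two-line `Conv2D` + `Activation` pattern from the earlier comparison becomes unnecessary: the common case is one call, and the explicit case is one named argument.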