
I'm working on an implementation of an LSTM neural network for sequence classification. I want to design a network with the following parameters:

  1. Input: a sequence of n one-hot vectors.
  2. Network topology: a two-layer LSTM network.
  3. Output: the probability that a given sequence belongs to a class (binary classification). I want to take into account only the last output of the second LSTM layer.

I need to implement this in CNTK, but I'm struggling because its documentation is not written very well. Can someone help me with that?


4 Answers


There is a sequence classification example that does exactly what you're looking for.

The only difference is that it uses just a single LSTM layer. You can easily change this network to use multiple layers by changing:

LSTM_function = LSTMP_component_with_self_stabilization(
    embedding_function.output, LSTM_dim, cell_dim)[0]

to:

num_layers = 2  # for example
encoder_output = embedding_function.output
for i in range(num_layers):
    # feed the previous layer's output into the next LSTM layer
    encoder_output = LSTMP_component_with_self_stabilization(
        encoder_output, LSTM_dim, cell_dim)[0]

However, you'd be better served by using the new layers library. Then you can simply do this:

num_layers = 2
encoder_output = Stabilizer()(input_sequence)
for i in range(num_layers):
    encoder_output = Recurrence(LSTM(hidden_dim))(encoder_output)

Then, to get the final output that you'd feed into a dense output layer, you can first do:

final_output = sequence.last(encoder_output)

and then

z = Dense(vocab_dim)(final_output)
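
Putting those pieces together, a minimal end-to-end sketch of the whole model might look like the following. The sizes input_dim and hidden_dim, the Dense(1) sigmoid head, and the binary_cross_entropy criterion are my assumptions for the binary case described in the question, not part of the example above, and this assumes a CNTK 2.x-style API:

import cntk as C
from cntk.layers import Stabilizer, Recurrence, LSTM, Dense

input_dim = 300    # vocabulary size of the one-hot inputs (assumed)
hidden_dim = 128   # LSTM hidden state size (assumed)
num_layers = 2

# a sequence of sparse one-hot vectors, as in the question
features = C.sequence.input_variable(input_dim, is_sparse=True)
labels = C.input_variable(1)

encoder_output = Stabilizer()(features)
for _ in range(num_layers):
    encoder_output = Recurrence(LSTM(hidden_dim))(encoder_output)

# keep only the last output of the second LSTM layer
final_output = C.sequence.last(encoder_output)

# a single sigmoid unit gives the class probability
z = Dense(1, activation=C.sigmoid)(final_output)
loss = C.binary_cross_entropy(z, labels)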
answered Nov 15, 2016 at 15:24

Here you can find a straightforward approach: just add the additional layer, like so:

Sequential([
    Recurrence(LSTM(hidden_dim), go_backwards=False),
    Recurrence(LSTM(hidden_dim), go_backwards=False),
    Dense(label_dim, activation=sigmoid)
])

Train it, test it, and apply it.
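
Note that as written, the Dense layer is applied at every time step, so the model emits one prediction per sequence element. If you only want a prediction for the last step, as the question asks, one option is to fold the second recurrence down to its final state. A sketch under that assumption (Fold comes from the same layers library; the sizes here are illustrative, not from the answer above):

import cntk as C
from cntk.layers import Sequential, Recurrence, Fold, LSTM, Dense

hidden_dim = 128  # assumed
label_dim = 1     # binary classification (assumed)

# Fold runs the recurrence but returns only the final hidden state,
# so Dense sees a single vector instead of a whole sequence.
model = Sequential([
    Recurrence(LSTM(hidden_dim), go_backwards=False),
    Fold(LSTM(hidden_dim)),
    Dense(label_dim, activation=C.sigmoid)
])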

answered Jan 23, 2017 at 18:48

CNTK published a hands-on tutorial for language understanding that has an end-to-end recipe:

This hands-on lab shows how to implement a recurrent network to process text, for the Air Travel Information Services (ATIS) task of slot tagging (tag individual words to their respective classes, where the classes are provided as labels in the training data set). We will start with a straightforward embedding of the words followed by a recurrent LSTM. This will then be extended to include neighboring words and run bidirectionally. Lastly, we will turn this system into an intent classifier.
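
For the bidirectional step mentioned above, the usual pattern is to run one recurrence forward and one backward over the same embeddings and splice the results. A rough sketch with assumed, ATIS-like sizes (see the tutorial itself for the exact recipe):

import cntk as C
from cntk.layers import Embedding, Recurrence, LSTM, Dense

vocab_size, emb_dim = 943, 150     # assumed, ATIS-like
hidden_dim, num_labels = 300, 129  # assumed, ATIS-like

words = C.sequence.input_variable(vocab_size, is_sparse=True)

embedded = Embedding(emb_dim)(words)
fwd = Recurrence(LSTM(hidden_dim))(embedded)                     # left-to-right
bwd = Recurrence(LSTM(hidden_dim), go_backwards=True)(embedded)  # right-to-left
bi = C.splice(fwd, bwd)        # concatenate both directions per word
tags = Dense(num_labels)(bi)   # one slot tag per word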

answered Jan 23, 2017 at 21:39

I'm not familiar with CNTK. But since the question has been left unanswered for so long, perhaps I can offer some advice to help you with the implementation. I'm not sure how experienced you are with these architectures, but before moving to CNTK (which seemingly has a less active community), I'd suggest looking at other popular frameworks (like Theano, TensorFlow, etc.).

For instance, a similar task in Theano is given here: kyunghyuncho tutorials. Just look for "def lstm_layer" for the definitions. A Torch example can be found in Karpathy's very popular tutorials.

Hope this helps a bit.

answered Aug 29, 2016 at 19:16