Project author: JRC1995

Project description:
Implementation of abstractive summarization using LSTM in the encoder-decoder architecture with local attention.
Language: Jupyter Notebook
Repository: git://github.com/JRC1995/Abstractive-Summarization.git
Created: 2017-10-04T15:40:17Z
Project page: https://github.com/JRC1995/Abstractive-Summarization

License: MIT License

Abstractive Summarization

Based on sequence-to-sequence (Seq2seq) learning with an attention mechanism, specifically local attention.

Loading Pre-processed Dataset

The data is preprocessed in Data_Pre-Processing.ipynb.

Dataset source: https://www.kaggle.com/snap/amazon-fine-food-reviews

    import json

    with open('Processed_Data/Amazon_Reviews_Processed.json') as file:
        for json_data in file:
            saved_data = json.loads(json_data)

            vocab2idx = saved_data["vocab"]
            embd = saved_data["embd"]
            train_batches_text = saved_data["train_batches_text"]
            test_batches_text = saved_data["test_batches_text"]
            val_batches_text = saved_data["val_batches_text"]
            train_batches_summary = saved_data["train_batches_summary"]
            test_batches_summary = saved_data["test_batches_summary"]
            val_batches_summary = saved_data["val_batches_summary"]
            train_batches_true_text_len = saved_data["train_batches_true_text_len"]
            val_batches_true_text_len = saved_data["val_batches_true_text_len"]
            test_batches_true_text_len = saved_data["test_batches_true_text_len"]
            train_batches_true_summary_len = saved_data["train_batches_true_summary_len"]
            val_batches_true_summary_len = saved_data["val_batches_true_summary_len"]
            test_batches_true_summary_len = saved_data["test_batches_true_summary_len"]

            break

    idx2vocab = {v: k for k, v in vocab2idx.items()}

Hyperparameters

    hidden_size = 300
    learning_rate = 0.001
    epochs = 5
    max_summary_len = 31  # summary_max_len used in Data_Pre-Processing, +1 for <EOS>
    D = 5  # D determines the local attention window size
    window_len = 2*D + 1
    l2 = 1e-6

Tensorflow Placeholders

    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()
    tf.disable_eager_execution()

    embd_dim = len(embd[0])

    tf_text = tf.placeholder(tf.int32, [None, None])
    tf_embd = tf.placeholder(tf.float32, [len(vocab2idx), embd_dim])
    tf_true_summary_len = tf.placeholder(tf.int32, [None])
    tf_summary = tf.placeholder(tf.int32, [None, None])
    tf_train = tf.placeholder(tf.bool)

Dropout Function

    def dropout(x, rate, training):
        # Apply dropout only in training mode; pass activations through unchanged at inference.
        return tf.cond(training,
                       lambda: tf.nn.dropout(x, rate=rate),
                       lambda: x)

Embed vectorized text

Dropout is used for regularization
(https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf).
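
For reference, tf.nn.dropout implements "inverted" dropout: during training each activation is zeroed with probability rate and the surviving activations are rescaled, so no extra scaling is needed at inference. With rate = 0.3 as used throughout this notebook:

    $\tilde{x}_i = \dfrac{m_i \, x_i}{1 - p}, \qquad m_i \sim \mathrm{Bernoulli}(1 - p), \quad p = 0.3$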

    embd_text = tf.nn.embedding_lookup(tf_embd, tf_text)

    embd_text = dropout(embd_text, rate=0.3, training=tf_train)

LSTM function

More info:

https://dl.acm.org/citation.cfm?id=1246450
https://www.bioinf.jku.at/publications/older/2604.pdf
https://en.wikipedia.org/wiki/Long_short-term_memory
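
For reference, the LSTM function below implements the standard update from the references above; writing $W$, $U$, and $b$ for the input weights, recurrent weights, and biases of the input, forget, and output gates and the candidate cell state (stored as w[0..3], u[0..3], b[0..3] in the code):

    $i_t = \sigma(x_t W_i + h_{t-1} U_i + b_i)$
    $f_t = \sigma(x_t W_f + h_{t-1} U_f + b_f)$
    $o_t = \sigma(x_t W_o + h_{t-1} U_o + b_o)$
    $\tilde{c}_t = \tanh(x_t W_c + h_{t-1} U_c + b_c)$
    $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$
    $h_t = o_t \odot \tanh(c_t)$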

    def LSTM(x, hidden_state, cell, input_dim, hidden_size, scope):

        with tf.variable_scope(scope, reuse=tf.AUTO_REUSE):

            w = tf.get_variable("w", shape=[4, input_dim, hidden_size],
                                dtype=tf.float32,
                                trainable=True,
                                initializer=tf.glorot_uniform_initializer())

            u = tf.get_variable("u", shape=[4, hidden_size, hidden_size],
                                dtype=tf.float32,
                                trainable=True,
                                initializer=tf.glorot_uniform_initializer())

            b = tf.get_variable("bias", shape=[4, 1, hidden_size],
                                dtype=tf.float32,
                                trainable=True,
                                initializer=tf.zeros_initializer())

            input_gate = tf.nn.sigmoid(tf.matmul(x, w[0]) + tf.matmul(hidden_state, u[0]) + b[0])
            forget_gate = tf.nn.sigmoid(tf.matmul(x, w[1]) + tf.matmul(hidden_state, u[1]) + b[1])
            output_gate = tf.nn.sigmoid(tf.matmul(x, w[2]) + tf.matmul(hidden_state, u[2]) + b[2])
            cell_ = tf.nn.tanh(tf.matmul(x, w[3]) + tf.matmul(hidden_state, u[3]) + b[3])

            cell = forget_gate*cell + input_gate*cell_
            hidden_state = output_gate*tf.tanh(cell)

        return hidden_state, cell

Bi-Directional LSTM Encoder

(https://maxwell.ict.griffith.edu.au/spl/publications/papers/ieeesp97_schuster.pdf)

More Info: https://machinelearningmastery.com/develop-bidirectional-lstm-sequence-classification-python-keras/

A bi-directional LSTM encoder consists of a forward encoder and a backward encoder. The forward encoder encodes the text sequence from start to end, and the backward encoder encodes it from end to start.
The final output is a combination (here, a concatenation) of the forward-encoded and the backward-encoded text.
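
Concretely, the merged encoder state at position $s$ (built in the "Merge" step below) is the concatenation of the forward and backward hidden states at that position, which is why the later attention and decoder layers work with vectors of size $2 \cdot hidden\_size$:

    $H_s = [\overrightarrow{h}_s \,;\, \overleftarrow{h}_s] \in \mathbb{R}^{2 \cdot hidden\_size}$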

Forward Encoding

    S = tf.shape(embd_text)[1]  # text sequence length
    N = tf.shape(embd_text)[0]  # batch_size

    i = 0
    hidden = tf.zeros([N, hidden_size], dtype=tf.float32)
    cell = tf.zeros([N, hidden_size], dtype=tf.float32)
    hidden_forward = tf.TensorArray(size=S, dtype=tf.float32)

    # shape of embd_text: [N,S,embd_dim]
    embd_text_t = tf.transpose(embd_text, [1, 0, 2])
    # shape of embd_text_t: [S,N,embd_dim]

    def cond(i, hidden, cell, hidden_forward):
        return i < S

    def body(i, hidden, cell, hidden_forward):
        x = embd_text_t[i]

        hidden, cell = LSTM(x, hidden, cell, embd_dim, hidden_size, scope="forward_encoder")
        hidden_forward = hidden_forward.write(i, hidden)

        return i+1, hidden, cell, hidden_forward

    _, _, _, hidden_forward = tf.while_loop(cond, body, [i, hidden, cell, hidden_forward])

Backward Encoding

    i = S-1
    hidden = tf.zeros([N, hidden_size], dtype=tf.float32)
    cell = tf.zeros([N, hidden_size], dtype=tf.float32)
    hidden_backward = tf.TensorArray(size=S, dtype=tf.float32)

    def cond(i, hidden, cell, hidden_backward):
        return i >= 0

    def body(i, hidden, cell, hidden_backward):
        x = embd_text_t[i]

        hidden, cell = LSTM(x, hidden, cell, embd_dim, hidden_size, scope="backward_encoder")
        hidden_backward = hidden_backward.write(i, hidden)

        return i-1, hidden, cell, hidden_backward

    _, _, _, hidden_backward = tf.while_loop(cond, body, [i, hidden, cell, hidden_backward])

Merge Forward and Backward Encoder Hidden States

    hidden_forward = hidden_forward.stack()
    hidden_backward = hidden_backward.stack()

    encoder_states = tf.concat([hidden_forward, hidden_backward], axis=-1)
    encoder_states = tf.transpose(encoder_states, [1, 0, 2])

    encoder_states = dropout(encoder_states, rate=0.3, training=tf_train)

    final_encoded_state = dropout(tf.concat([hidden_forward[-1], hidden_backward[-1]], axis=-1), rate=0.3, training=tf_train)

Implementation of attention scoring function

Given the sequence of encoder states ($H_s$) and the decoder hidden state ($H_t$) at the current timestep $t$, the attention score is computed as

    $\mathrm{score}(H_s, H_t) = (H_s W_a)\, H_t^\top$

where $W_a$ is a trainable parameter matrix.

(https://nlp.stanford.edu/pubs/emnlp15_attn.pdf)

    def attention_score(encoder_states, decoder_hidden_state, scope="attention_score"):

        with tf.variable_scope(scope, reuse=tf.AUTO_REUSE):
            Wa = tf.get_variable("Wa", shape=[2*hidden_size, 2*hidden_size],
                                 dtype=tf.float32,
                                 trainable=True,
                                 initializer=tf.glorot_uniform_initializer())

        encoder_states = tf.reshape(encoder_states, [N*S, 2*hidden_size])
        encoder_states = tf.reshape(tf.matmul(encoder_states, Wa), [N, S, 2*hidden_size])

        decoder_hidden_state = tf.reshape(decoder_hidden_state, [N, 2*hidden_size, 1])

        return tf.reshape(tf.matmul(encoder_states, decoder_hidden_state), [N, S])

Local Attention Function

Based on: https://nlp.stanford.edu/pubs/emnlp15_attn.pdf
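
In this local-p formulation, as implemented below, the model first predicts the centre $p_t$ of an attention window of length $2D+1$ from the decoder state $H_t$, and then weights the content-based scores with a Gaussian centred at $p_t$ (the Gaussian term is equation (10) of the paper, as noted in the code; $W_p$ and $v_p$ denote the trainable parameters Wp and Vp):

    $p_t = D + (S - 2D - 1)\cdot\mathrm{sigmoid}\big(v_p^\top \tanh(W_p H_t)\big)$

    $g_t(s) = \exp\!\Big(-\dfrac{(s - p_t)^2}{2\sigma^2}\Big), \qquad \sigma = \dfrac{D}{2}$

The final attention weights are $\mathrm{softmax}_s\big(\mathrm{score}(H_s, H_t)\cdot g_t(s)\big)$; note that, unlike the paper, the Gaussian term here is applied to the raw scores before the softmax over all positions.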

    def align(encoder_states, decoder_hidden_state, scope="attention"):

        with tf.variable_scope(scope, reuse=tf.AUTO_REUSE):
            Wp = tf.get_variable("Wp", shape=[2*hidden_size, 128],
                                 dtype=tf.float32,
                                 trainable=True,
                                 initializer=tf.glorot_uniform_initializer())

            Vp = tf.get_variable("Vp", shape=[128, 1],
                                 dtype=tf.float32,
                                 trainable=True,
                                 initializer=tf.glorot_uniform_initializer())

        positions = tf.cast(S-window_len, dtype=tf.float32)  # Maximum valid attention window starting position

        # Predict attention window starting position
        ps = positions*tf.nn.sigmoid(tf.matmul(tf.tanh(tf.matmul(decoder_hidden_state, Wp)), Vp))
        # ps = (soft-)predicted starting position of the attention window

        pt = ps+D  # pt = center of the attention window (the whole window length is 2*D+1)
        pt = tf.reshape(pt, [N])

        i = 0
        gaussian_position_based_scores = tf.TensorArray(size=S, dtype=tf.float32)
        sigma = tf.constant(D/2, dtype=tf.float32)

        def cond(i, gaussian_position_based_scores):
            return i < S

        def body(i, gaussian_position_based_scores):
            score = tf.exp(-((tf.square(tf.cast(i, tf.float32)-pt))/(2*tf.square(sigma))))
            # (equation (10) in https://nlp.stanford.edu/pubs/emnlp15_attn.pdf)
            gaussian_position_based_scores = gaussian_position_based_scores.write(i, score)
            return i+1, gaussian_position_based_scores

        i, gaussian_position_based_scores = tf.while_loop(cond, body, [i, gaussian_position_based_scores])

        gaussian_position_based_scores = gaussian_position_based_scores.stack()
        gaussian_position_based_scores = tf.transpose(gaussian_position_based_scores, [1, 0])
        gaussian_position_based_scores = tf.reshape(gaussian_position_based_scores, [N, S])

        scores = attention_score(encoder_states, decoder_hidden_state)*gaussian_position_based_scores
        scores = tf.nn.softmax(scores, axis=-1)

        return tf.reshape(scores, [N, S, 1])

LSTM Decoder With Local Attention
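
At each step the decoder below combines its LSTM state $H_t$ with the attention-weighted context vector $c_t$ into an attentional hidden state in the style of Luong et al., and the vocabulary logits are obtained by projecting that state onto the transposed embedding matrix (weight tying with the input embeddings); writing $E$ for the embedding matrix tf_embd and $W_c$ for Wc:

    $\tilde{H}_t = \tanh\big([H_t \,;\, c_t]\, W_c\big), \qquad \mathrm{logits}_t = \tilde{H}_t\, E^\top$

Decoding is greedy: the embedding of the argmax word is fed back as the next input.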

    with tf.variable_scope("decoder", reuse=tf.AUTO_REUSE):
        SOS = tf.get_variable("sos", shape=[1, embd_dim],
                              dtype=tf.float32,
                              trainable=True,
                              initializer=tf.glorot_uniform_initializer())

        # SOS represents the starting marker.
        # It tells the decoder that it is about to decode the first word of the output.
        # SOS is set up as a trainable parameter.

        Wc = tf.get_variable("Wc", shape=[4*hidden_size, embd_dim],
                             dtype=tf.float32,
                             trainable=True,
                             initializer=tf.glorot_uniform_initializer())

    SOS = tf.tile(SOS, [N, 1])  # now SOS shape: [N,embd_dim]
    inp = SOS
    hidden = final_encoded_state
    cell = tf.zeros([N, 2*hidden_size], dtype=tf.float32)

    decoder_outputs = tf.TensorArray(size=max_summary_len, dtype=tf.float32)
    outputs = tf.TensorArray(size=max_summary_len, dtype=tf.int32)

    attention_scores = align(encoder_states, hidden)
    encoder_context_vector = tf.reduce_sum(encoder_states*attention_scores, axis=1)

    for i in range(max_summary_len):

        inp = dropout(inp, rate=0.3, training=tf_train)
        inp = tf.concat([inp, encoder_context_vector], axis=-1)

        hidden, cell = LSTM(inp, hidden, cell, embd_dim+2*hidden_size, 2*hidden_size, scope="decoder")
        hidden = dropout(hidden, rate=0.3, training=tf_train)

        attention_scores = align(encoder_states, hidden)
        encoder_context_vector = tf.reduce_sum(encoder_states*attention_scores, axis=1)

        concated = tf.concat([hidden, encoder_context_vector], axis=-1)

        linear_out = tf.nn.tanh(tf.matmul(concated, Wc))
        decoder_output = tf.matmul(linear_out, tf.transpose(tf_embd, [1, 0]))
        # produce unnormalized probability distribution over the vocabulary

        decoder_outputs = decoder_outputs.write(i, decoder_output)

        # Pick out the most probable vocab indices based on the unnormalized probability distribution
        next_word_vec = tf.cast(tf.argmax(decoder_output, 1), tf.int32)
        next_word_vec = tf.reshape(next_word_vec, [N])

        outputs = outputs.write(i, next_word_vec)

        next_word = tf.nn.embedding_lookup(tf_embd, next_word_vec)
        inp = tf.reshape(next_word, [N, embd_dim])

    decoder_outputs = decoder_outputs.stack()
    outputs = outputs.stack()

    decoder_outputs = tf.transpose(decoder_outputs, [1, 0, 2])
    outputs = tf.transpose(outputs, [1, 0])

Define Cross Entropy Cost Function and L2 Regularization
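
In symbols, the objective minimized below is the pad-masked cross-entropy averaged over all (example, timestep) positions plus an L2 penalty on the non-bias trainable parameters (with $\lambda$ = l2 = 1e-6; tf.nn.l2_loss contributes $\tfrac{1}{2}\lVert\theta\rVert_2^2$ per tensor):

    $\mathcal{L} = \mathrm{mean}_{n,t}\big(\mathrm{CE}_{n,t}\cdot \mathrm{mask}_{n,t}\big) + \lambda \sum_{\theta} \tfrac{1}{2}\lVert \theta \rVert_2^2$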

    filtered_trainables = [var for var in tf.trainable_variables() if
                           not("Bias" in var.name or "bias" in var.name
                               or "noreg" in var.name)]

    regularization = tf.reduce_sum([tf.nn.l2_loss(var) for var
                                    in filtered_trainables])

    with tf.variable_scope("loss"):

        epsilon = tf.constant(1e-9, tf.float32)

        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=tf_summary, logits=decoder_outputs)

        pad_mask = tf.sequence_mask(tf_true_summary_len,
                                    maxlen=max_summary_len,
                                    dtype=tf.float32)

        masked_cross_entropy = cross_entropy*pad_mask

        cost = tf.reduce_mean(masked_cross_entropy) + \
               l2*regularization

        cross_entropy = tf.reduce_mean(masked_cross_entropy)

Accuracy

    # Comparing predicted sequence with labels
    comparison = tf.cast(tf.equal(outputs, tf_summary),
                         tf.float32)

    # Masking to ignore the effect of pads while calculating accuracy
    pad_mask = tf.sequence_mask(tf_true_summary_len,
                                maxlen=max_summary_len,
                                dtype=tf.bool)

    masked_comparison = tf.boolean_mask(comparison, pad_mask)

    # Accuracy
    accuracy = tf.reduce_mean(masked_comparison)

Define Optimizer

    all_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)

    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    gvs = optimizer.compute_gradients(cost, all_vars)
    capped_gvs = [(tf.clip_by_norm(grad, 5), var) for grad, var in gvs]  # Gradient Clipping
    train_op = optimizer.apply_gradients(capped_gvs)
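
tf.clip_by_norm rescales any gradient tensor whose L2 norm exceeds the threshold (5 here), preserving its direction while bounding its magnitude:

    $g \leftarrow g \cdot \min\!\Big(1, \dfrac{5}{\lVert g \rVert_2}\Big)$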

Training and Validation

    import pickle
    import random

    with tf.Session() as sess:  # Start Tensorflow Session

        display_step = 100
        patience = 5

        load = input("\nLoad checkpoint? y/n: ")
        print("")
        saver = tf.train.Saver()

        if load.lower() == 'y':

            print('Loading pre-trained weights for the model...')

            saver.restore(sess, 'Model_Backup/Seq2seq_summarization.ckpt')
            sess.run(tf.global_variables())
            sess.run(tf.tables_initializer())

            with open('Model_Backup/Seq2seq_summarization.pkl', 'rb') as fp:
                train_data = pickle.load(fp)

            covered_epochs = train_data['covered_epochs']
            best_loss = train_data['best_loss']
            impatience = 0

            print('\nRESTORATION COMPLETE\n')

        else:
            best_loss = 2**30
            impatience = 0
            covered_epochs = 0

            init = tf.global_variables_initializer()
            sess.run(init)
            sess.run(tf.tables_initializer())

        epoch = 0
        while (epoch+covered_epochs) < epochs:

            print("\n\nSTARTING TRAINING\n\n")

            batches_indices = [i for i in range(0, len(train_batches_text))]
            random.shuffle(batches_indices)

            total_train_acc = 0
            total_train_loss = 0

            for i in range(0, len(train_batches_text)):

                j = int(batches_indices[i])

                cost, prediction,\
                    acc, _ = sess.run([cross_entropy,
                                       outputs,
                                       accuracy,
                                       train_op],
                                      feed_dict={tf_text: train_batches_text[j],
                                                 tf_embd: embd,
                                                 tf_summary: train_batches_summary[j],
                                                 tf_true_summary_len: train_batches_true_summary_len[j],
                                                 tf_train: True})

                total_train_acc += acc
                total_train_loss += cost

                if i % display_step == 0:
                    print("Iter "+str(i)+", Cost= " +
                          "{:.3f}".format(cost)+", Acc = " +
                          "{:.2f}%".format(acc*100))

                if i % 500 == 0:

                    idx = random.randint(0, len(train_batches_text[j])-1)

                    text = " ".join([idx2vocab.get(vec, "<UNK>") for vec in train_batches_text[j][idx]])
                    predicted_summary = [idx2vocab.get(vec, "<UNK>") for vec in prediction[idx]]
                    actual_summary = [idx2vocab.get(vec, "<UNK>") for vec in train_batches_summary[j][idx]]

                    print("\nSample Text\n")
                    print(text)
                    print("\nSample Predicted Summary\n")
                    for word in predicted_summary:
                        if word == '<EOS>':
                            break
                        else:
                            print(word, end=" ")
                    print("\n\nSample Actual Summary\n")
                    for word in actual_summary:
                        if word == '<EOS>':
                            break
                        else:
                            print(word, end=" ")
                    print("\n\n")

            print("\n\nSTARTING VALIDATION\n\n")

            total_val_loss = 0
            total_val_acc = 0

            for i in range(0, len(val_batches_text)):

                if i % 100 == 0:
                    print("Validating data # {}".format(i))

                cost, prediction,\
                    acc = sess.run([cross_entropy,
                                    outputs,
                                    accuracy],
                                   feed_dict={tf_text: val_batches_text[i],
                                              tf_embd: embd,
                                              tf_summary: val_batches_summary[i],
                                              tf_true_summary_len: val_batches_true_summary_len[i],
                                              tf_train: False})

                total_val_loss += cost
                total_val_acc += acc

            avg_val_loss = total_val_loss/len(val_batches_text)

            print("\n\nEpoch: {}\n\n".format(epoch+covered_epochs))
            print("Average Training Loss: {:.3f}".format(total_train_loss/len(train_batches_text)))
            print("Average Training Accuracy: {:.2f}".format(100*total_train_acc/len(train_batches_text)))
            print("Average Validation Loss: {:.3f}".format(avg_val_loss))
            print("Average Validation Accuracy: {:.2f}".format(100*total_val_acc/len(val_batches_text)))

            if (avg_val_loss < best_loss):
                best_loss = avg_val_loss
                save_data = {'best_loss': best_loss, 'covered_epochs': covered_epochs+epoch+1}
                impatience = 0
                with open('Model_Backup/Seq2seq_summarization.pkl', 'wb') as fp:
                    pickle.dump(save_data, fp)
                saver.save(sess, 'Model_Backup/Seq2seq_summarization.ckpt')
                print("\nModel saved\n")

            else:
                impatience += 1

            if impatience > patience:
                break

            epoch += 1

    Load checkpoint? y/n: n

    STARTING TRAINING

    Iter 0, Cost= 1.493, Acc = 0.00%

    Sample Text

    i was given these as a gift ... they were so amazing i now order them for all occasions and sometimes just because i had n't had them in a while . a little warning ; they are completely addictive . i like the <UNK> ones ; my girlfriend likes the rocky road . highly recommended ! < br / > < br / > sure to be appreciated by everyone on your gift list .

    Sample Predicted Summary

    condolence s.e. foodstuff condolence webbed poverty squarely poverty poverty assists foodstuff webbed poverty methodist foodstuff webbed poverty gephardt foodstuff ethier articulos meh rojos cols colombians webbed poverty condolence poverty condolence hourly

    Sample Actual Summary

    simply amazing brownies ...

    Iter 100, Cost= 0.684, Acc = 26.98%
    Iter 200, Cost= 0.649, Acc = 27.19%
    Iter 300, Cost= 0.744, Acc = 25.93%
    Iter 400, Cost= 0.976, Acc = 19.88%
    Iter 500, Cost= 0.839, Acc = 21.53%

    Sample Text

    for those looking for a <UNK> water beverage and one with a neutral taste that does n't have <UNK> aftertaste , this one 's for <UNK> < br / > < br / > also , traditional tap water is slightly more acidic ( i believe ph 7-8 ) . <UNK> 's is supposed at 9.5 ph , so if you 're very sensitive to acidic products , this might help you out .

    Sample Predicted Summary

    good

    Sample Actual Summary

    neutral taste , low ph

    Iter 600, Cost= 0.697, Acc = 27.82%
    Iter 700, Cost= 0.763, Acc = 24.24%
    Iter 800, Cost= 0.792, Acc = 24.82%
    Iter 900, Cost= 0.866, Acc = 23.13%
    Iter 1000, Cost= 0.838, Acc = 23.03%

    Sample Text

    i love my starbucks sumatra first thing in the morning . i was not always up early enough to take the detour to starbucks and now i do n't have to ! these <UNK> are perfect and delicious . now i can have my fav coffee even before i take off my slippers ! i love this product ! it 's easy to order - arrived quickly and the price was good .

    Sample Predicted Summary

    great

    Sample Actual Summary

    no drive through at starbucks ?

    Iter 1100, Cost= 0.648, Acc = 30.58%
    Iter 1200, Cost= 0.977, Acc = 19.08%
    Iter 1300, Cost= 0.788, Acc = 23.29%
    Iter 1400, Cost= 0.681, Acc = 28.23%
    Iter 1500, Cost= 0.608, Acc = 29.32%

    Sample Text

    husband loves this tea especially in the <UNK> recommend using the large cup setting on your keurig brewer unless you prefer your tea extra strong .

    Sample Predicted Summary

    great tea

    Sample Actual Summary

    good substitute for coffee .

    Iter 1600, Cost= 0.709, Acc = 27.48%
    Iter 1700, Cost= 0.729, Acc = 31.11%
    Iter 1800, Cost= 0.627, Acc = 28.93%
    Iter 1900, Cost= 0.798, Acc = 26.36%
    Iter 2000, Cost= 0.856, Acc = 22.08%

    Sample Text

    can no longer find this product locally anymore . i purchased it previously at a warehouse club but costco , bj ` s and sam ` s club no longer stock it in my area stores . my two golden retriever ` s love this gravy when added to their mix of both dry and moist dog food . hope it stays on the market ... <UNK> !

    Sample Predicted Summary

    great

    Sample Actual Summary

    best pet food gravy

    Iter 2100, Cost= 0.640, Acc = 30.77%
    Iter 2200, Cost= 0.792, Acc = 24.49%
    Iter 2300, Cost= 0.735, Acc = 22.86%
    Iter 2400, Cost= 0.769, Acc = 21.68%
    Iter 2500, Cost= 0.900, Acc = 21.15%

    Sample Text

    i want to start out by saying that i thought at first that a bag with only 120 calories and 4 grams of fat ( no saturated or trans ) for every 20 chips was going to taste like crap . i must say that not only was i wrong , that this is my favorite bbq chip on the market today . they are light and you can not taste any fat or grease after eating them . that 's because they are n't baked or fried , just popped as their name suggests . these chips are very easy to dip as well . fantastic product !

    Sample Predicted Summary

    great chips

    Sample Actual Summary

    fantastic chips ! ! !

    Iter 2600, Cost= 0.740, Acc = 22.86%
    Iter 2700, Cost= 0.848, Acc = 24.84%
    Iter 2800, Cost= 0.677, Acc = 28.57%
    Iter 2900, Cost= 0.779, Acc = 25.90%
    Iter 3000, Cost= 0.718, Acc = 27.34%

    Sample Text

    this <UNK> of 7-ounce `` taster 's choice french roast '' canisters , is a good buy . the coffee is flavored differently than original flavor , but the difference is very subtle , and refreshingly good . overall , this taster 's choice coffee is a bargain , and highly recommended .

    Sample Predicted Summary

    great flavor

    Sample Actual Summary

    good buy

Future Works

  • Beam Search
  • Pointer Mechanisms
  • BLEU/ROUGE evaluation
  • Implement Testing
  • Complete Training and Optimize Hyperparameters