Jack's Keras Self-Study Notes 3: The Functional Model

The functional model takes its name from functional programming: layers and models are called like functions on tensors.
The Keras functional API gives you a way to define multi-output models, directed acyclic graphs (DAGs), and models with shared layers.
In other words, whenever your model is not a single straight-line stack of layers like VGG, or it needs more than one output, you should choose the functional model. It is the most general class of models; the Sequential model is just a special case of it.

This section assumes you are already familiar with the Sequential model.

First model: a fully connected network (one in which every unit in a layer is connected to every unit in the next layer)

Sequential is actually the better way to implement a fully connected network, but we start with this simplest case. A few concepts to get straight before we begin:
A layer instance is callable on a tensor, and returns a tensor
An input tensor and an output tensor together define a model, which is created with Model
Such a model can be trained just like a Sequential model

from keras.layers import Input, Dense
from keras.models import Model

# This returns a tensor
inputs = Input(shape=(784,))

# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

# This creates a model that includes
# the Input layer and three Dense layers
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
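# (a sketch with made-up data: `data` and `labels` are never defined above,
#  so we fabricate random Numpy arrays of the right shapes to make fit() run)
import numpy as np
from keras.utils import to_categorical
data = np.random.random((1000, 784))
labels = to_categorical(np.random.randint(10, size=(1000,)), num_classes=10)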
model.fit(data, labels)  # starts training

With the functional API it is easy to reuse a trained model: you can treat any model as if it were a layer and call it on a tensor. Note that when you call a model this way, you reuse not only its architecture but also its weights.
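
For example, calling the model trained above on a new input tensor returns its 10-way softmax output:

x = Input(shape=(784,))
# This works, and returns the 10-way softmax defined above;
# the call reuses the weights of `model`, not just its structure.
y = model(x)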

Multi-input and multi-output models

A typical use case for the functional model is building a model with multiple inputs and multiple outputs.
We take a Twitter analysis task as an example: the main input is a news headline encoded as a sequence of word indices, an auxiliary input supplies 5 extra features, and the model predicts a binary label at both a main output and an auxiliary output.
import keras  # needed below for keras.layers.concatenate
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')

# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)

# A LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)


# Here we insert the auxiliary output; its loss allows the LSTM and
# Embedding layers to be trained smoothly even though the main loss
# is computed much deeper in the model.
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)

# At this point we feed in the auxiliary input data by concatenating
# it with the LSTM output.
auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])

# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)

model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])

# Compile with a list of losses and weights; the auxiliary loss is down-weighted.
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[1., 0.2])
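
# (a sketch with made-up data so the fit() calls below run:
#  headline_data holds word indices in [1, 10000), additional_data has 5 features)
import numpy as np
headline_data = np.random.randint(1, 10000, size=(1000, 100))
additional_data = np.random.random((1000, 5))
labels = np.random.randint(0, 2, size=(1000, 1))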

model.fit([headline_data, additional_data], [labels, labels],
          epochs=50, batch_size=32)

Since the inputs and outputs are named (we passed them a name argument), we could equally well have compiled the model via:

model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.2})

# And trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
          {'main_output': labels, 'aux_output': labels},
          epochs=50, batch_size=32)

Shared layers

Another situation that calls for the functional model is the use of shared layers.
Take a dataset of tweets as an example: suppose we want to judge whether two tweets were written by the same person. A natural approach is to encode both tweets with one and the same LSTM layer.

import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model

# each tweet is represented as a matrix of shape (140, 256): a sequence
# of 140 character positions, each a 256-dimensional binary vector
tweet_a = Input(shape=(140, 256))
tweet_b = Input(shape=(140, 256))
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)

# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)

# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)

# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)
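
# (a sketch with made-up data so fit() runs; real inputs would be binary
#  character encodings of tweets)
import numpy as np
data_a = np.random.randint(0, 2, size=(100, 140, 256)).astype('float32')
data_b = np.random.randint(0, 2, size=(100, 140, 256)).astype('float32')
labels = np.random.randint(0, 2, size=(100, 1))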

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit([data_a, data_b], labels, epochs=10)

The concept of layer "nodes"
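
A quick illustration with the shared_lstm defined above (a minimal sketch using get_output_at, the standard Keras accessor): every time a layer is called on a tensor it creates a new "node", so a shared layer ends up with several nodes and you must say which output you mean:

# shared_lstm was called twice, so a bare shared_lstm.output is ambiguous;
# ask for the output of a specific node instead
assert shared_lstm.get_output_at(0) == encoded_a
assert shared_lstm.get_output_at(1) == encoded_b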