Skip to content

convmodel

convmodel provides a conversation model based on transformers GPT-2 model 😉

✨ Features ✨

  • Utilizes GPT2 model to generate response
  • Handles multi-turn conversation
  • Provides useuful interfaces to fine-tune model and generate a response from a given context

A simple example of fine-tune GPT-2 model and generate a response:

from convmodel import ConversationModel
from convmodel import ConversationExample

# Load model on GPU
model = ConversationModel.from_pretrained("gpt2")

# Define training/validation examples
train_iterator = [
    ConversationExample(conversation=[
        "Hello",
        "Hi, how are you?",
        "Good, thank you, how about you?",
        "Good, thanks!"
    ]),
    ConversationExample(conversation=[
        "I am hungry",
        "How about eating pizza?"
    ]),
]
valid_iterator = [
    ConversationExample(conversation=[
        "Tired...",
        "Let's have a break!",
        "Nice idea!"
    ]),
]

# Fine-tune model
model.fit(train_iterator=train_iterator, valid_iterator=valid_iterator)

# Generate response
model.generate(context=["Hello", "How are you"], do_sample=True, top_p=0.95, top_k=50)
# Output could be like below if sufficient examples were given.
# => ConversationModelOutput(responses=['Good thank you'], context=['Hello', 'How are you'])

Please refer to Model Training for more details of training.

ConversationModel adopts simple input schema by concatenating each utterance with <sep> token as below. Please refer to Model Architecture Overview for more details.

position 0 1 2 3 4 5 6 7 8 9
word \<sep> Hello \<sep> How are you \<sep> Good thank you
input_ids 50256 15496 50256 2437 389 345 50256 10248 5875 345
token_type_ids 0 0 1 1 1 1 0 0 0 0
attention_mask 1 1 1 1 1 1 1 1 1 1
↓ ↓ ↓ ↓
generated word - - - - - - Good thank you \<sep>

Enjoy talking with your conversational AI 😉