How to fine-tune BERT to classify your Slack chats without coding


Nikolai Liubimov


BERT figure taken from original paper

Slack chats become messy over time, making it difficult to extract meaningful information. In this article, I want to present a quick, codeless way of fine-tuning and deploying the commonly used BERT classifier for conversational analysis. We will use this system to extract tasks, facts, and other valuable information from our Slack conversations. It can easily be extended to categorize any other textual data, like support requests, emails, etc.

What do we want to extract from Slack? It depends on what your team uses it for and what type of information is being shared. For us, a lot of to-do tasks get lost in the non-stop flow of text, along with valuable pieces of information that could be added to the product documentation or discussed during daily meetings.


Machine learning lifecycle from the Label Studio perspective

Label Studio provides an easy way to experiment with NLP machine learning, covering the full ML lifecycle from raw unlabeled data to a deployed prediction service in less than one hour. If you want to play around with your Slack chats, follow these simple steps:

Collecting the data

Collecting data from Slack can be done with the Evernote Slack bot integration. It's easy: set it up, then run /clip in the Slack channel you want to dump. You'll then find a corresponding document in the Evernote app:


Dumped Slack conversations using Evernote app

Then the raw document should be split into separate messages. It is also helpful to save timestamps for further analytics. Create a JSON-formatted file tasks.json with the following list of items:

[{
  "text": "Hello, dear Anton.\nLabel Studio is the basis of our Heartex platform. Everything you describe here is available on the platform. And much more:)",
  "date": "November 15, 2019"
}]
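If you'd rather script this step, here is a minimal sketch of the conversion. It assumes the Evernote dump was saved as plain text where each message starts with a date line (matching the date format shown above); the file name slack_dump.txt and the line-based format are assumptions, so adjust the parsing to your actual export:

import json
import re

# Assumed format: a plain-text dump where each message begins with a
# date line like "November 15, 2019", followed by the message body.
DATE_RE = re.compile(r"^(January|February|March|April|May|June|July|August|"
                     r"September|October|November|December) \d{1,2}, \d{4}$")

tasks, current_date, buffer = [], None, []

with open("slack_dump.txt", encoding="utf-8") as f:
    for line in f:
        line = line.rstrip("\n")
        if DATE_RE.match(line):
            # A new date line starts a new message: flush the previous one.
            if buffer:
                tasks.append({"text": "\n".join(buffer), "date": current_date})
                buffer = []
            current_date = line
        elif line:
            buffer.append(line)

if buffer:
    tasks.append({"text": "\n".join(buffer), "date": current_date})

with open("tasks.json", "w", encoding="utf-8") as f:
    json.dump(tasks, f, ensure_ascii=False, indent=2)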

Fine-tuning BERT classifier

You can connect Label Studio to a machine learning backend and use it for predictions, retraining the model whenever new data is available. The model updates its state each time it receives a newly annotated task, and the new state is redeployed for inference on incoming tasks. Once set up, fine-tuning becomes very easy: the only thing you need to do is label your tasks, and after a while you get a working model available as a REST API service.

We’ve specifically built an app that integrates Label Studio with a machine learning backend powered by a BERT model from the open-source Transformers library. Assuming you have docker-compose installed, you can start it with:

git clone https://github.com/heartexlabs/label-studio-transformers
cd label-studio-transformers
docker-compose up

This command launches Label Studio locally at http://localhost:8200. Go to the import page and upload the previously created tasks.json file. That’s it; all setup is done. Now you can classify your messages from the labeling UI, checking whether the model’s suggestions improve over time.


Fine-tuning BERT classifier with Label Studio.

A few hundred messages may be enough, depending on how complex a task you want to solve and what accuracy you expect. In our case, we classified the messages into five classes: question, issue, suggestion, commit, and other. Acceptable quality builds up after about three hundred labeled tasks, which takes roughly an hour of annotation. You can also ask your grandma to fine-tune the classifier if you don’t have enough time.
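For reference, a classification setup like this corresponds to a Label Studio labeling config along the following lines. The label/text names match the prediction format shown in the next section; the exact config shipped with label-studio-transformers may differ, so treat this as a sketch:

<View>
  <!-- Show the message text and offer one choice among the five classes -->
  <Text name="text" value="$text"/>
  <Choices name="label" toName="text" choice="single">
    <Choice value="Question"/>
    <Choice value="Issue"/>
    <Choice value="Suggestion"/>
    <Choice value="Commit"/>
    <Choice value="Other"/>
  </Choices>
</View>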

Recognizing chat messages

Since the model was already deployed during the annotation process, you can use it right away to predict labels for new messages via the REST API:

curl -X POST -H 'Content-Type: application/json' -d '{"text": "Its National Donut Day."}' http://localhost:8200/predict

If everything is OK, you’ll see a JSON response:

[
  {
    "cluster": null,
    "result": [
      {
        "from_name": "label",
        "to_name": "text",
        "type": "choices",
        "value": {
          "choices": [
            "Other"
          ]
        }
      }
    ],
    "score": 1.6718149185180664
  }
]

where “Other” is the predicted message type and “score” is the model’s confidence (for a more detailed explanation of the format, please refer to the Label Studio docs).
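For batch scoring from a script, the same call can be made in Python. This is a sketch using the requests library against the endpoint and response format shown above; the sample messages are just placeholders:

import requests

# Score a batch of messages against the locally deployed model.
messages = [
    "Its National Donut Day.",
    "The build fails on master, please check.",
]

for text in messages:
    resp = requests.post("http://localhost:8200/predict", json={"text": text})
    resp.raise_for_status()
    # The service returns a list with one prediction per request,
    # in the format shown above.
    prediction = resp.json()[0]
    label = prediction["result"][0]["value"]["choices"][0]
    print(f"{label!r} (score={prediction['score']:.3f}): {text}")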

Besides using the model through the Label Studio API, you can also find all checkpoints inside the storage/label-studio-ml-backend/model directory and inspect training progress with TensorBoard.

What we have learned

Slack chats are messy, but thanks to modern NLP it becomes easy to put structure into your conversations and get useful insights. For example, you can explore what happens in your conversations over a given time period, as we did by aggregating predictions for our messages over the past few months (a sketch of this aggregation follows the chart):


Our Slack conversations over the past few months according to BERT model
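One way to reproduce a chart like this is sketched below. It assumes the tasks.json file and prediction endpoint from earlier; pandas and matplotlib are our assumptions here, not part of the original setup:

import json

import matplotlib.pyplot as plt
import pandas as pd
import requests

# Load the messages with their dates and ask the deployed model
# to label each one.
with open("tasks.json", encoding="utf-8") as f:
    tasks = json.load(f)

rows = []
for task in tasks:
    resp = requests.post("http://localhost:8200/predict", json={"text": task["text"]})
    label = resp.json()[0]["result"][0]["value"]["choices"][0]
    rows.append({"date": task["date"], "label": label})

# Count predicted labels per month: one line per class over time.
df = pd.DataFrame(rows)
df["month"] = pd.to_datetime(df["date"]).dt.to_period("M")
counts = df.groupby(["month", "label"]).size().unstack(fill_value=0)

counts.plot(kind="line")
plt.show()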

Observing patterns in this chart, we concluded:

  • lots of casual conversations during the Christmas and New Year holidays
  • no casual conversations when there are issues/bug reports
  • after code gets committed, discussion of issues follows
  • issues are correlated with suggestions, which is to be expected
  • suggestions lead to questions and discussion

What we have actually learned is that spending one hour annotating your texts is enough to build a reasonable ML tool. I believe labeling more data would give much higher, more fine-grained accuracy to benefit from.

Conclusion

By creating a Slack analytics application, we have shown how one can start solving complex machine learning tasks using only data annotation. Instead of writing code, you teach the model to perform the task by showing it examples of how you’d solve it. It is possible to go deeper and use Label Studio with other machine learning frameworks, e.g., fine-tuning computer vision models for segmentation, pose estimation, or scene detection, among other tasks. We’d love to hear what ideas and implementations you come up with.

Happy labeling!