Introducing tidychatmodels for communicating with AI chatbots

AI
I introduce tidychatmodels, a package I built to communicate with AI chatbots via a common interface.
Author

Albert Rapp

Published

March 10, 2024

You can communicate with AI chatbots like ChatGPT via API requests. In R, this is easily done with the httr2 package. But many chatbots have slightly different API structures, so it would be great to have a common interface for all of them. And that’s exactly the problem tidychatmodels tries to solve.

This package provides a simple interface to chat with your favorite AI chatbot from R. It is inspired by the modular nature of tidymodels, where you can easily swap out one ML model for another while keeping the rest of the workflow the same. In the same vein, this package aims to communicate with different chatbot vendors like OpenAI, mistral.ai, etc. using the same interface.

Basically, this package is a wrapper around the APIs of different chatbots and provides a unified interface to communicate with them. The underlying package that handles all the communication is {httr2}. For a deep dive into {httr2}, you can check out one of my tutorials on YouTube. And if you prefer the video version of this blog post, you can find that on YouTube too.
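
To appreciate what the package abstracts away, here is a rough sketch of a raw chat request built directly with {httr2}. The endpoint and body shape follow OpenAI’s public chat completions API; the exact request that tidychatmodels constructs internally may differ.

```r
library(httr2)

# Sketch of a raw OpenAI chat completions request (based on OpenAI's
# public API, not on tidychatmodels internals).
req <- request('https://api.openai.com/v1/chat/completions') |>
  req_auth_bearer_token(Sys.getenv('OAI_DEV_KEY')) |>
  req_body_json(list(
    model = 'gpt-3.5-turbo',
    messages = list(list(role = 'user', content = 'Hello'))
  ))

# req_perform(req) would actually send the request (and consume credits).
```

Every vendor requires a slightly different version of this boilerplate, and that is precisely the part the package hides from you.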

Installation

The first thing you need to do is install the package, of course. Currently, the package is only available on GitHub, so you will need the devtools package to install it. If you don’t have devtools installed, you can install it like any other package.

# install.packages("devtools")
devtools::install_github("AlbertRapp/tidychatmodels")

Getting Started

Another thing you need to get started is an API key from the chatbot vendor you want to use. For example, to use the OpenAI chatbot, you will need to sign up for an API key on OpenAI’s website. Once you have that key, you can use it to authenticate with the OpenAI API. I recommend saving the key in a .env file and loading it into your R environment with the {dotenv} package. After that, you can create a chat object.
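
A .env file is just a plain text file with one KEY=value pair per line. The variable name OAI_DEV_KEY is my own choice and has to match whatever you pass to Sys.getenv() later; the key value below is only a placeholder.

```
# .env
OAI_DEV_KEY=sk-...
```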

dotenv::load_dot_env('.env')
library(tidyverse)
library(tidychatmodels)
chat_openai <- create_chat('openai', Sys.getenv('OAI_DEV_KEY'))
chat_openai
## Chat Engine: openai 
## Messages: 0

Afterwards, you can add a model to the chat object. In this case, we are adding the gpt-3.5-turbo model. The user is responsible for knowing which models are available at a vendor like OpenAI.

chat_openai |>
  add_model('gpt-3.5-turbo')
## Chat Engine: openai 
## Messages: 0 
## Model: gpt-3.5-turbo

Similarly, you can add parameters to the chat object. These help to customize how the AI chatbot responds. For example, setting the temperature to a value closer to zero tells the chatbot to be more precise rather than creative.

create_chat('openai', Sys.getenv('OAI_DEV_KEY')) |>
  add_model('gpt-3.5-turbo') |>
  add_params(temperature = 0.2)
## Chat Engine: openai 
## Messages: 0 
## Model: gpt-3.5-turbo 
## Parameters: 
##    temperature: 0.2

Afterwards, you can add messages to your chat object using different roles. Typically, you might first use a system message to set the stage for what your bot is required to do. Afterwards, you can add a user message.

Here, let us build a chatbot that summarizes subtitles downloaded from YouTube. For example, I’ve downloaded the subtitles from my last httr2 video. Let’s load them and pass them into the user message.

subtitle <- read_lines('httr2-guide.txt') |> 
  paste(collapse = ' ') 
# show first 200 characters
subtitle |> str_sub(end = 200)
## [1] "web apis are everywhere and making  requests to them is often a necessary  part in a data analysis to even get the  data in the first place and in the age  of AI you will also need to make API  reques"

chat_openai <- create_chat('openai', Sys.getenv('OAI_DEV_KEY')) |>
  add_model('gpt-3.5-turbo') |>
  add_params(temperature = 0.2) |>
  add_message(
    role = 'system',
    message = 'You are a helpful subtitle summarizer.
    You goal is to summarize YouTube subtitles into 5 takeaways.
    You receive subtitles and return a summary in the following form:
    
    TAKEAWAY 1: < insert takeaway 1 >
    TAKEAWAY 2: < insert takeaway 2 >
    TAKEAWAY 3: < insert takeaway 3 >
    TAKEAWAY 4: < insert takeaway 4 >
    TAKEAWAY 5: < insert takeaway 5 >'
  ) |> 
  add_message(
    # default role = 'user'
    message = subtitle
  ) 
chat_openai
## Chat Engine: openai 
## Messages: 2 
## Model: gpt-3.5-turbo 
## Parameters: 
##    temperature: 0.2

At this stage, you haven’t actually started any chat with the bot. You can do so by calling the perform_chat() function. Beware that this consumes your API calls and will likely incur costs. Once the chat is performed, you can extract the messages from the chat object.

chat_openai <- chat_openai |> perform_chat()
msgs <- chat_openai |> extract_chat(silent = TRUE)
msgs
## # A tibble: 3 × 2
##   role      message                                                             
##   <chr>     <chr>                                                               
## 1 system    "You are a helpful subtitle summarizer.\n    You goal is to summari…
## 2 user      "web apis are everywhere and making  requests to them is often a ne…
## 3 assistant "TAKEAWAY 1: Web APIs are essential for data analysis as they provi…

Here, I’ve set the silent parameter to TRUE so that the full chat isn’t printed into the console. But the function returns a tibble of messages. This allows us to see the summary of our text.
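
Since extract_chat() returns an ordinary tibble, you can also manipulate it with the usual tidyverse verbs. Here is a small sketch using a stand-in tibble shaped like the output above (the messages are abbreviated, not the real ones):

```r
library(dplyr)

# Stand-in for the tibble that extract_chat() returns
msgs <- tibble(
  role    = c('system', 'user', 'assistant'),
  message = c('You are a helpful subtitle summarizer. [...]',
              'web apis are everywhere [...]',
              'TAKEAWAY 1: Web APIs are essential [...]')
)

# Pull only the assistant's reply
assistant_reply <- msgs |>
  filter(role == 'assistant') |>
  pull(message)
```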

msgs$message[3] |> cat()
## TAKEAWAY 1: Web APIs are essential for data analysis as they provide access to data, and making requests to them is a common practice.
## TAKEAWAY 2: The HTTR 2 package simplifies the process of making API requests by providing a consistent workflow regardless of the API being used.
## TAKEAWAY 3: Understanding how to communicate with APIs involves knowing how to structure requests, handle responses, and untangle JSON data formats.
## TAKEAWAY 4: Authentication with APIs, such as using API keys and setting up clients, is crucial for accessing restricted data and services.
## TAKEAWAY 5: Utilizing tools like the curl translate function and environment variables can streamline the process of working with APIs and handling authentication.

Try other text

The reply sounds a bit generic, probably because we summarized a tutorial video. Let’s try the subtitles from my posit::conf(2023) talk instead. The nice thing is that we can keep the same workflow and just replace the content of the user message.

subtitle <- read_lines('positconf-talk.txt') |> 
  paste(collapse = ' ') 

chat_openai <- create_chat('openai', Sys.getenv('OAI_DEV_KEY')) |>
  add_model('gpt-3.5-turbo') |>
  add_params(temperature = 0.2) |>
  add_message(
    role = 'system',
    message = 'You are a helpful subtitle summarizer.
    You goal is to summarize YouTube subtitles into 5 takeaways.
    You receive subtitles and return a summary in the following form:
    
    TAKEAWAY 1: < insert takeaway 1 >
    TAKEAWAY 2: < insert takeaway 2 >
    TAKEAWAY 3: < insert takeaway 3 >
    TAKEAWAY 4: < insert takeaway 4 >
    TAKEAWAY 5: < insert takeaway 5 >'
  ) |> 
  add_message(
    # default role = 'user'
    message = subtitle
  ) |> 
  perform_chat()
msgs <- chat_openai |> extract_chat(silent = TRUE)
msgs$message[3] |> cat()
## TAKEAWAY 1: Adding HTML and CSS to popular R tools like ggplot, gt, Quarto, and Shiny can enhance the visual appeal of outputs.
## TAKEAWAY 2: Incorporating HTML and CSS into data visualizations can help create unique and visually appealing designs.
## TAKEAWAY 3: Tools like ggtext and opt_css allow for customization of elements like color and style within R-generated outputs.
## TAKEAWAY 4: Learning to use HTML and CSS in conjunction with R tools can be a valuable skill for creating more engaging data visualizations.
## TAKEAWAY 5: Sharing customized Quarto themes with others can be done effectively through platforms like GitHub, allowing for collaboration and reuse of styles.

That looks good. ChatGPT seems to extract good takeaways from my talk. But let’s try to make the output shorter.

Add additional messages

If you want to tell ChatGPT to make the previous output shorter, you will have to

  • append another user message to the chat and
  • send the whole chat again.

Here’s how that looks with tidychatmodels.

msgs <- chat_openai |> 
  add_message('Make this much shorter') |> 
  perform_chat() |> 
  extract_chat(silent = TRUE)
msgs$message[5] |> cat()
## TAKEAWAY 1: Enhance R tool outputs with HTML and CSS for unique designs.
## TAKEAWAY 2: Customization options like ggtext and opt_css improve visual appeal.
## TAKEAWAY 3: Learning HTML and CSS with R tools boosts data visualization skills.
## TAKEAWAY 4: Share Quarto themes via GitHub for collaboration and reuse.
## TAKEAWAY 5: Incorporating HTML and CSS in R tools can create engaging data visualizations.

Switching to another vendor

Let’s recap our full workflow.

chat_openai <- create_chat('openai', Sys.getenv('OAI_DEV_KEY')) |>
  add_model('gpt-3.5-turbo') |>
  add_params(temperature = 0.2) |>
  add_message(
    role = 'system',
    message = 'You are a helpful subtitle summarizer.
    You goal is to summarize YouTube subtitles into 5 takeaways.
    You receive subtitles and return a summary in the following form:
    
    TAKEAWAY 1: < insert takeaway 1 >
    TAKEAWAY 2: < insert takeaway 2 >
    TAKEAWAY 3: < insert takeaway 3 >
    TAKEAWAY 4: < insert takeaway 4 >
    TAKEAWAY 5: < insert takeaway 5 >'
  ) |> 
  add_message(
    # default role = 'user'
    message = subtitle
  ) |> 
  perform_chat()

You can easily switch to another vendor now. For example, let’s go for the mistral-large-latest model from Mistral.ai.

chat_mistral <- create_chat('mistral', Sys.getenv('MISTRAL_DEV_KEY')) |>
  add_model('mistral-large-latest') |>
  add_params(temperature = 0.2) |>
  add_message(
    role = 'system',
    message = 'You are a helpful subtitle summarizer.
    You goal is to summarize YouTube subtitles into 5 takeaways.
    You receive subtitles and return a summary in the following form:
    
    TAKEAWAY 1: < insert takeaway 1 >
    TAKEAWAY 2: < insert takeaway 2 >
    TAKEAWAY 3: < insert takeaway 3 >
    TAKEAWAY 4: < insert takeaway 4 >
    TAKEAWAY 5: < insert takeaway 5 >'
  ) |> 
  add_message(
    # default role = 'user'
    message = subtitle
  ) |> 
  perform_chat()
msgs <- chat_mistral |> extract_chat(silent = TRUE)
msgs$message[3] |> cat()
## TAKEAWAY 1: Albert Rapp, a math PhD student and content creator, discusses the combination of HTML and CSS with popular R tools to create beautiful outputs, focusing on data visualization, tables, and design.
## 
## TAKEAWAY 2: By adding a little HTML and CSS to R tools like ggplot, gt, Quarto, and Shiny, users can create unique and special outputs, making their work stand out.
## 
## TAKEAWAY 3: Albert emphasizes that learning HTML and CSS isn't difficult once you're familiar with R tools, as it often involves simple keyword changes and copy-pasting code snippets.
## 
## TAKEAWAY 4: Quarto, an R tool for creating reports and presentations, can be a gateway to learning HTML and CSS, as it generates HTML and CSS code that users can modify to customize their outputs.
## 
## TAKEAWAY 5: Albert suggests using CSS variables in SCSS files and inspecting web elements to learn HTML and CSS, making it easier to customize Quarto outputs and other web development projects. He also provides a YouTube tutorial for HTML and CSS beginners.

Nice! Switching vendor and model worked pretty smoothly. Currently, tidychatmodels supports only three vendors: openai, mistral, and ollama. The latter is a great tool that you can install on your computer to run local LLMs. Let’s try that next.

Use a local model from ollama

Compared to OpenAI and mistral.ai, ollama’s API is a bit different, but with the tidychatmodels interface everything should still work the same, or at least very similarly. For example, creating a chat works pretty much the same but doesn’t require an API key.

create_chat('ollama') 
## Chat Engine: ollama 
## Messages: 0 
## Parameters: 
##    stream: FALSE

Notice that there is already a parameter stream set to FALSE. This reflects a difference in the API of the ollama chat engine: by default, ollama streams the reply token by token. But {httr2} doesn’t handle that out of the box (or rather, I didn’t bother looking into how to do that with {httr2}). That’s why we set stream to FALSE by default.
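
For reference, the JSON body that ends up at the local ollama server then looks roughly like this. This sketch is based on ollama’s REST API; the exact body that tidychatmodels sends may differ.

```json
{
  "model": "llama2",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "stream": false
}
```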

Now you can add a local model to your chat. Assume that you have (outside of R) pulled a model like llama2:

ollama pull llama2

Then you can add models and messages just like before.

chat_ollama <- create_chat('ollama') |>
  add_model('llama2') |>
  add_params(temperature = 0.2) |>
  add_message(
    role = 'system',
    message = 'You are a helpful subtitle summarizer.
    You goal is to summarize YouTube subtitles into 5 takeaways.
    You receive subtitles and return a summary in the following form:
    
    TAKEAWAY 1: < insert takeaway 1 >
    TAKEAWAY 2: < insert takeaway 2 >
    TAKEAWAY 3: < insert takeaway 3 >
    TAKEAWAY 4: < insert takeaway 4 >
    TAKEAWAY 5: < insert takeaway 5 >'
  ) |> 
  add_message(
    message = subtitle
  ) |> 
  perform_chat()
msgs <- chat_ollama |> extract_chat(silent = TRUE)
msgs$message[3] |> cat()
## 
## In this talk, the speaker discussed how to customize Quarto themes for HTML and CSS beginners. The speaker explained that Quarto generates HTML and CSS code based on SCSS variables, and showed how to inspect the generated code using the web developer view in a web browser. The speaker also mentioned that there are limitations to what can be achieved with HTML and CSS in the R context, but did not provide specific details.
## 
## The audience asked questions about dipping into JavaScript for enhancing apps, and sharing new Quarto themes with others. The speaker replied that while JavaScript can be fascinating, HTML and CSS are often enough to change the styles of a website; however, there may be limitations to what can be achieved with these technologies in the R context. To share new Quarto themes with others, the speaker suggested sharing the SCSS files via GitHub repository, as they person shared their theme on GitHub.
## 
## Overall, the talk provided an introduction to customizing Quarto themes for HTML and CSS beginners, and discussed how to inspect and modify the generated code using the web developer view in a web browser.

Notice that the output isn’t particularly nice. It doesn’t even abide by our “TAKEAWAY 1”, “TAKEAWAY 2” structure. This happens because not all LLMs operate like ChatGPT or Mistral’s models, which have been specifically trained to follow the chat format and use things like system messages. With local models, you might have to reformulate your message so that the model only has to complete the text for you.

chat_ollama <- create_chat('ollama') |>
  add_model('llama2') |>
  add_params(temperature = 0.2) |>
  add_message(
    role = 'user',
    message = glue::glue('I have just heard this talk:

    {subtitle}
    
    And my five takeaways are:'
  )) |> 
  perform_chat()
msgs <- chat_ollama |> extract_chat(silent = TRUE)
msgs$message[2] |> cat()
## 1. Quarto is a powerful tool for building and styling documents with CSS.
## 2. Quarto has a built-in CSS variables feature that allows you to define custom styles and reuse them across multiple documents.
## 3. To use the web developer view in your browser, open the element of the page you want to modify, right-click on the element, and select "Inspect". This will open the web developer view, where you can see the HTML and CSS code for that element.
## 4. To make changes to the styles of an element in the web developer view, you can copy and paste the relevant CSS code into your SCSS file and modify it as needed.
## 5. Sharing Quarto themes with others can be done by sharing the SCSS files on GitHub or another collaboration platform.

See how the output looks more like a list of 5 takeaways? That’s because we inserted the subtitles into the message and ended it with “my five takeaways are:”. You could even use the “TAKEAWAY” format.

chat_ollama <- create_chat('ollama') |>
  add_model('llama2') |>
  add_params(temperature = 0.2) |>
  add_message(
    role = 'user',
    message = glue::glue('I have just heard this talk:

    {subtitle}
    
    And my five takeaways are:
    
    TAKEAWAY1:'
  )) |> 
  perform_chat()
msgs <- chat_ollama |> extract_chat(silent = TRUE)
msgs$message[2] |> cat()
## HTML and CSS are powerful tools for building web applications, but they have limitations in terms of functionality and performance.
## 
## TAKEAWAY2: JavaScript can enhance the user experience of a web application, but it can also be intimidating and overwhelming for beginners.
## 
## TAKEAWAY3: There are several resources available for learning HTML, CSS, and JavaScript, including online tutorials, books, and video courses.
## 
## TAKEAWAY4: It's important to have a basic understanding of HTML and CSS before diving into JavaScript, as they provide the foundation for building web applications.
## 
## TAKEAWAY5: Sharing Quarto themes with others can be done by sharing the SCSS files on GitHub or other resource-sharing platforms.

Limitations of local LLMs

Of course, you will have to keep in mind that not all models are as powerful as the ones from OpenAI or mistral.ai. Notice how the results of our local llama2 model are not particularly good. But at least we get a sensible result. If you use even smaller models, you might get a nonsensical reply. Let’s try out the small gemma:7b model.

chat_ollama <- create_chat('ollama') |>
  add_model('gemma:7b') |>
  add_params(temperature = 0.2) |>
  add_message(
    role = 'user',
    message = glue::glue('I have just heard this talk:

    {subtitle}
    
    And my five takeaways are:
    
    TAKEAWAY1:'
  )) |> 
  perform_chat()
msgs <- chat_ollama |> extract_chat(silent = TRUE)
msgs$message[2] |> cat()
## TAKEAWAY2:
## TA TAKEAWAY3:
## TA TAKEAWAY4:
## TA TAKEAWAY5:

This might complete the text, but it’s probably not exactly what we’re looking for. One could possibly fine-tune all of that further, but that’s beyond the introduction to tidychatmodels we want to give here.

Conclusion

Nice! This concludes our introduction to tidychatmodels. The package is still in its infancy and I intend to add more features, so keep an eye out for that. For now, feel free to install the package from GitHub, and don’t hesitate to leave a feature request or bug report.


Stay in touch

If you enjoyed this post, then you may also like my weekly 3-minute newsletter. Every week, I share insights on data visualization, statistics and Shiny web app development. Reading time: 3 minutes or less. You can check it out via this link.

You can also support my work with a coffee