LLM vs SLM: Why Small Language Models?

Large Language Models (LLMs) and Small Language Models (SLMs) represent distinct approaches to natural language processing. LLMs are massive models trained on vast amounts of text data, which enables them to generate fluent text, translate between languages, write many kinds of creative content, and answer a broad range of questions. However, their size necessitates substantial computational resources.

In contrast, SLMs are smaller models trained on more focused datasets. This makes them computationally efficient and suitable for specific tasks. They often excel in particular domains or applications.

Use Cases for LLMs

  • Content generation: Creating various text formats, from articles to code.
  • Machine translation: Translating between different languages.
  • Chatbots and virtual assistants: Providing interactive and informative conversations.
  • Summarization: Condensing long texts into shorter summaries.

LLMs excel at generating diverse text formats, from marketing copy and social media content to scripts, poems, and translations. They can also provide informative answers to a wide range of questions.

Use Cases for SLMs

  • Domain-specific tasks: Excelling in tasks requiring specialized knowledge, such as medical or legal text processing, as well as code-specific tasks.
  • Resource-constrained environments: Operating efficiently on devices with limited computational power.
  • Faster training and deployment: Shorter development cycles compared to LLMs.

SLMs demonstrate strengths in specific text-based applications, excelling in tasks such as sentiment analysis, text classification, and named entity recognition. They can also be tailored for specialized domains like healthcare or finance. Additionally, SLMs can be adapted to support niche programming languages, providing solutions for specific development challenges.

Trade-offs and Considerations

LLMs demand substantial computational resources for training and deployment, reflecting their complexity and size. In contrast, SLMs are more efficient due to their smaller scale. While LLMs often excel in diverse language tasks, SLMs can be specialized for specific domains. Data requirements also differ significantly, with LLMs needing vast datasets and SLMs operating on smaller, focused collections. Ultimately, the choice between an LLM and an SLM hinges on factors such as computational budget, required performance, and the nature of the target application.

Hybrid Approaches

Hybrid approaches to language models combine the strengths of large language models (LLMs) and smaller, more specialized language models (SLMs). Transfer learning involves utilizing a pre-trained LLM as a foundation and adapting it to specific tasks through fine-tuning on domain-specific data. This approach benefits from the knowledge captured in the base LLM while tailoring the model to the target domain. Model distillation compresses a large LLM into a smaller, more efficient SLM while preserving key functionalities. This technique enables deployment in resource-constrained environments without significant performance degradation. By strategically combining LLMs and SLMs, organizations can develop robust and adaptable language models capable of handling a wide range of tasks.
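To make the distillation idea more concrete, here is a minimal sketch of the soft-target loss that is commonly used for it: the student is trained to match the teacher's output distribution, softened by a temperature. The logits, the temperature of 2.0, and the function names are made up for illustration; a real setup would compute this loss inside a training loop of a deep learning framework.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the softened teacher and student distributions.

    The factor temperature**2 keeps the loss scale comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]  # made-up scores from the large model
student = [2.5, 1.5, 0.2]  # made-up scores from the small model
print(distillation_loss(teacher, student))  # positive: the student still differs
print(distillation_loss(teacher, teacher))  # zero: the distributions match
```

The closer the student's distribution gets to the teacher's, the smaller the loss becomes, which is what drives the small model toward the behavior of the large one.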

Run Code LLMs Locally Without the Cloud: 4 User-Friendly Tools (Infographic)

Learn how to run Code LLMs locally with ease using these 4 tools designed for experimentation and exploration of AI technology on your local computer or on-premise server.

LM Studio: Discover a beginner-friendly interface with drag-and-drop functionality for basic code generation tasks.

Ollama: Immerse yourself in an interactive environment tailored for exploring diverse models and functionalities with ease.

Transformers Pipeline from Huggingface: Harness a high-level Python API that wraps model loading, tokenization, and inference in a single call, suited for users who want to script and customize their setup.

Transformers Models from Huggingface: Enjoy maximum flexibility by directly loading and utilizing Code LLM models from local or remote storage, empowering developers with seamless integration.

If you’re more into the technical side, in this blog post I explain how to set up and run a CodeT5 LLM in Python, which you can easily run locally on your laptop or on premises.

6 Python Libs to make a Powerful AI Training Stack

PyTorch

PyTorch is a deep learning framework that is based on the Torch library. It is a powerful and flexible framework for building and training neural networks, and it is particularly well-suited for GPU-accelerated training.

Strength: Powerful and flexible deep learning framework.

Website pytorch.org, PyTorch on GitHub

Keras

Keras is a high-level neural network API that runs on top of backends such as TensorFlow, JAX, and PyTorch. It provides a user-friendly interface for defining neural network architectures and training models, making it a popular choice for beginners and experienced practitioners alike.

Strength: User-friendly interface for defining neural network architectures and training models.

Website keras.io, Keras on GitHub

Pandas

Pandas is a Python library for data manipulation and analysis. It provides a powerful set of tools for data cleaning, transformation, and analysis, making it an essential tool for working with structured data in machine learning.

Strength: Powerful data manipulation and analysis for structured data.

Website pandas.pydata.org, Pandas on GitHub

NumPy

NumPy is a numerical Python library that provides efficient data structures and operations for numerical computing. It is essential for working with large datasets, especially in scientific computing and machine learning.

Strength: Fast numerical data manipulation and operations.

Website numpy.org, NumPy on GitHub

Matplotlib

Matplotlib is a Python library for creating 2D plots and visualizations. It is a versatile tool for visualizing data, and it is widely used in machine learning for tasks such as data exploration, model evaluation, and communication of results.

Strength: Versatile tool for creating 2D plots and visualizations.

Website matplotlib.org, Matplotlib on GitHub

Huggingface Transformers

Transformers is a library for constructing, utilizing, and adapting transformer-based NLP models, enabling NLP practitioners to build effective NLP applications tailored to their specific needs.

Website huggingface.co, Transformers on GitHub

Together, these libraries form a powerful and versatile AI training stack and are widely used in machine learning and in work with LLMs.

What Does ‘nit’ Mean in Code Reviews?

What does nit mean in a code review?

Occasionally we come across comments like this in code reviews:

auto result = std::make_pair<uint64_t, std::string>(64, "Hallihallo");;

nit: double semicolon

In a review, “nit” denotes a small inaccuracy or mistake that does not significantly affect the functionality of the code but should still be corrected. Examples are a typo in a comment, a superfluous semicolon, or an extra blank line. The reviewer points out the mistake but probably does not want you to delay the pull request over such a minor detail.

And this is how we should deal with it: if you are still working on the PR, you can fix it in one of the next commits. But do not delay the integration of the feature or bugfix because of it. We all know that waiting for the CI can take time, and if you block the CI for such a minor detail, some people will probably not be very happy.

To learn more about the slang used in code reviews, I have compiled a list in this blog post (in English) with explanations of abbreviations such as +1, WIP, lgtm, and others.

The English version of this post can be found here.

Fix: No GPU support in Tensorflow

I came across a problem where my Tensorflow installation did not recognize the installed GPU, despite Cuda and the Nvidia drivers being installed properly.

Test:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

returned an empty list. Furthermore, the log says it cannot find the Cuda drivers:

2024-01-30 14:57:42.015454: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.

Output of the Nvidia tool is correct and shows Cuda is installed:

nvidia-smi

ubuntu@ip-bla-foo:~/build-nb$  nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

Which tells us it is version 12. Ahhh!💡

Now, Cuda 12.1 is a release from 2023, and my idea was that Tensorflow 2.13 might not support this version yet, see https://blog.tensorflow.org/2023/11/whats-new-in-tensorflow-2-15.html

Ok, the latest version pip offered was TF 2.13 on Python 3.8. Here is the fix:

  1. upgrade Python: sudo apt install python3.9
  2. a new venv: virtualenv --python /usr/bin/python3.9 ~/.env-python3.9
  3. source ~/.env-python3.9/bin/activate
  4. pip install --upgrade pip
  5. python3 -m pip install tensorflow[and-cuda]==2.15.0.post1

Test: python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

2024-01-30 15:27:04.458720: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-30 15:27:04.458772: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-30 15:27:04.459601: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-30 15:27:04.465334: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-30 15:27:05.115551: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-01-30 15:27:05.560865: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-01-30 15:27:05.585883: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-01-30 15:27:05.586100: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Now we see the GPU in Tensorflow.
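When chasing this kind of mismatch, it can help to print what TensorFlow itself was built against next to the devices it sees. The following is a small sketch, not part of the original fix: the function name is made up, and the "cuda_version" and "cudnn_version" keys are what I would expect from a Cuda-enabled build, so they are read defensively with .get().

```python
def describe_tensorflow_gpu_support():
    """Return a dict describing what TensorFlow sees, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    build = tf.sysconfig.get_build_info()  # build-time facts such as the CUDA version
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "cuda_version": build.get("cuda_version"),    # assumed key, None if absent
        "cudnn_version": build.get("cudnn_version"),  # assumed key, None if absent
        "gpus": [device.name for device in tf.config.list_physical_devices("GPU")],
    }


report = describe_tensorflow_gpu_support()
if report is None:
    print("TensorFlow is not installed in this environment")
else:
    for key, value in report.items():
        print(f"{key}: {value}")
```

If the reported Cuda version is older than what nvcc shows and the GPU list is empty, a version mismatch like the one described above is a likely suspect.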

Large vs Small LLMs – Thoughts

If you are working on a task that is very specific, a smaller LLM may be able to learn the task-specific patterns more quickly than a larger LLM. Additionally, if you are working on a resource-constrained device, a smaller LLM may be the only option. Read in this blog post how to prepare an LLM for a specific task.

Benefits of large LLMs, such as 70B

Large language models (LLMs) with more parameters are typically trained on larger datasets. The parameters are the learned weights of the connections between the neurons in the LLM’s neural network. The more parameters there are, the more connections there are, and the more complex the patterns the network can represent.

Benefits of smaller LLMs, such as 6B or 770M

If I have a task that requires Python, I don’t need a model trained on Haskell, Go, and Rust. A model that is trained on a variety of programming languages has to spread its capacity across all of them, which can make it less effective at generating code in one specific language than a model of the same size that concentrates on that language.

An LLM that is trained on a large dataset of Python, Haskell, Go, and Rust code may be able to generate code in all of these languages. However, it may not be as good at generating idiomatic Python code as an LLM that is specifically trained on Python code.

If you have a task that requires Python, it is generally best to use an LLM that is specifically trained on Python code. This will give you the best chance of generating code that is syntactically correct, semantically meaningful, and idiomatic.

A 6B model is significantly more convenient for many purposes: less expensive to operate, runs on your laptop, maybe more accurate on that specific language if the training data is good.
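A quick back-of-the-envelope calculation shows why size matters so much for running a model on a laptop. The sketch below only counts the memory needed to hold the weights (parameter count times bytes per parameter); activations, caches, and framework overhead come on top, and the function name is made up for illustration.

```python
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(parameter_count, dtype="float16"):
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return parameter_count * BYTES_PER_PARAMETER[dtype] / 1e9


for name, parameters in [("770M", 770e6), ("6B", 6e9), ("70B", 70e9)]:
    sizes = ", ".join(
        f"{dtype}: {weight_memory_gb(parameters, dtype):.1f} GB"
        for dtype in BYTES_PER_PARAMETER
    )
    print(f"{name:>4}  {sizes}")
```

In 16-bit precision, a 6B model needs about 12 GB for its weights, while a 70B model needs about 140 GB, which is far beyond what a typical laptop offers.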

A good way to decide whether to use an LLM that is trained on multiple programming languages or an LLM that is specifically trained on one programming language is to experiment with both and see which one works better for your task.

Prepare Data for Code LLM Training

If you want to teach your LLM some tricks, you need to prepare some training data and run a training (or fine-tuning) on the LLM. For more complex knowledge, this would be a set of a few dozen or even hundreds of data pairs: an input and the output the model should produce for it. This is called Supervised Learning.

For example: a piece of code, and a description of what the code does. If you write about 100 of these pairs, the LLM will start understanding and be able to explain code it hasn’t seen before. It can also be a piece of code and an instruction: the instruction describes how the given code should be built. As a result, the LLM will be able to write code out of text instructions.

Example:
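A minimal sketch of what such pairs could look like, stored as JSON Lines (one JSON object per line), which many training tools accept. The field names "code" and "description" and the two sample pairs are made up for illustration; use whatever names your training tool expects.

```python
import json

# Two hand-written pairs; the field names "code" and "description" are illustrative.
pairs = [
    {
        "code": "def add(a, b):\n    return a + b",
        "description": "Returns the sum of two values.",
    },
    {
        "code": "squares = [n * n for n in range(10)]",
        "description": "Builds a list with the squares of the numbers 0 to 9.",
    },
]


def to_jsonl(records):
    """Serialize each pair as one JSON object per line (JSON Lines)."""
    return "\n".join(json.dumps(record) for record in records)


jsonl_text = to_jsonl(pairs)
print(jsonl_text)

# Reading the lines back gives the original pairs again.
restored = [json.loads(line) for line in jsonl_text.splitlines()]
```

Because JSON escapes the line breaks inside the code, every pair stays on a single line, which makes the file easy to append to and to split later.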

How much should I write?

You can start seeing results with as little as 100 pairs. But the actual number you will need depends on various factors such as model complexity, data quality, diversity, the complexity of the task or the available training resources.

More complex models might require more data to learn effectively. Higher-quality data can lead to better performance, and it can compensate for a smaller dataset to some extent. A diverse dataset covering various programming languages, problem domains, and styles can enhance the model’s generalization. If the task requires highly nuanced or specialized descriptions, more data might be needed to capture these nuances effectively. The computational resources available for training play a role too; larger datasets might require more computational power and time.

How to Start

Begin with a reasonably sized dataset and monitor the model’s performance. You can then incrementally add more data, observing how the model improves with additional training examples.

As a general rule of thumb, having several thousand pairs of code and descriptions is a good starting point for training a language model effectively. However, this can vary significantly based on the factors mentioned above.

Tools that Help

First, you need to get a larger set of code snippets from your code base, or from something you find on the internet or on GitHub. A useful tool for that is Tree-sitter. It supports a lot of languages (parsers), from JS, Python, C++ and the like to more esoteric languages such as Erlang, Haskell, and Fennel (a Lisp that compiles to Lua). For a base dataset, you need your data to be somewhat diverse and to cover each topic roughly equally, such as language datatypes, conditional constructs, and I/O. When it gets to your specific use cases, identify what is essential and make sure you cover everything.
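To illustrate the snippet extraction step, here is a small sketch that pulls functions out of Python source. Note that it uses Python's built-in ast module rather than the multi-language parser mentioned above, so it only handles Python; the sample source and the record layout are made up for illustration.

```python
import ast

SAMPLE_SOURCE = '''
def area(width, height):
    """Compute the area of a rectangle."""
    return width * height


class Greeter:
    def greet(self, name):
        return f"Hello, {name}!"
'''


def extract_functions(source):
    """Return one record per function found in the source, including methods."""
    tree = ast.parse(source)
    records = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            records.append(
                {
                    "name": node.name,
                    "code": ast.get_source_segment(source, node),
                    # An existing docstring is a useful first draft of the description.
                    "description": ast.get_docstring(node) or "",
                }
            )
    return records


for record in extract_functions(SAMPLE_SOURCE):
    print(record["name"], "->", repr(record["description"]))
```

Functions without a docstring come back with an empty description, which marks exactly the snippets that still need a hand-written text.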

When you have your list of snippets, you can import them into a tool such as OpenDocString which helps you write the descriptions, balance the topics of your dataset and gives insights on data quality and diversity. The tool is in its early stage, but looks already very promising and makes life much easier.
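Whichever tool you use, a rough balance check is easy to do yourself. The sketch below assumes every pair carries a hypothetical "topic" label and flags topics that fall below a chosen share of the dataset; the 20% threshold and the sample data are arbitrary, and this is not how any particular tool implements it.

```python
from collections import Counter


def topic_shares(pairs):
    """Return the fraction of the dataset that each topic accounts for."""
    counts = Counter(pair["topic"] for pair in pairs)
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.items()}


def underrepresented(pairs, minimum_share=0.2):
    """List topics whose share of the dataset falls below the threshold."""
    return sorted(t for t, share in topic_shares(pairs).items() if share < minimum_share)


# Hypothetical dataset: only the topic labels matter for this check.
dataset = (
    [{"topic": "datatypes"}] * 5
    + [{"topic": "conditionals"}] * 4
    + [{"topic": "io"}] * 1
)
print(topic_shares(dataset))
print(underrepresented(dataset))
```

Here the "io" topic makes up only a tenth of the pairs, so it is the one to write more examples for.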

Once done, you have a larger list of code and descriptions, which you can then feed to your model for training, either using an online service or by training locally on your machine or on a cloud instance.

ReDoc: Simplifying API Documentation for Open Source Developers

In the world of open source software development, creating user-friendly and informative API documentation is crucial. The right documentation can be the bridge that connects developers to your project, making it more accessible and inviting collaboration. That’s where ReDoc comes into the picture, offering a powerful solution for generating interactive API documentation with ease.

What Is ReDoc?

ReDoc is an open-source tool designed to simplify the process of creating interactive API documentation. It’s tailored for APIs that adhere to the OpenAPI Specification (formerly known as Swagger), which is a widely adopted standard for describing RESTful APIs. ReDoc takes your OpenAPI Specification file and transforms it into visually appealing and user-friendly documentation that developers love.

Why ReDoc?

As open source developers, we’re constantly seeking ways to make our projects more accessible and inviting to the community. High-quality API documentation is a significant part of this effort. Here’s why ReDoc is a game-changer for open source software development:

1. Interactivity: ReDoc creates interactive documentation that enables developers to explore API endpoints and responses in a user-friendly manner. This interactivity keeps users engaged and simplifies their learning experience.

2. A Modern Look: ReDoc offers a clean and modern design for your API documentation. It’s responsive, visually appealing, and aligns with the high standards that open source projects aim for.

3. OpenAPI Compatibility: If your API is described in an OpenAPI YAML or JSON file (and it should be!), ReDoc can seamlessly generate documentation from it. This ensures that your documentation is always in sync with your API.

4. Customization: ReDoc provides various customization options, allowing you to tailor the documentation to match your project’s branding and style. You can adjust colors, fonts, and other design elements to make it your own.

5. Ease of Integration: Integrating ReDoc into your existing documentation infrastructure is straightforward. You can host the generated documentation on your website, making it easily accessible to users.

6. Community and Support: ReDoc boasts an active and growing community of users and contributors. This means you can find support and resources when you need them.

7. Multiple Themes: ReDoc offers multiple pre-designed themes, making it easy to switch between different looks for your API documentation.

How to Get Started with ReDoc

Using ReDoc is as simple as 1-2-3:

  1. OpenAPI Specification: Make sure your API is described in an OpenAPI YAML or JSON file.
  2. Installation: Install ReDoc and specify the location of your OpenAPI Specification file.
  3. Customization: If desired, customize the documentation to match your project’s branding.
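The first two steps can be sketched in a few lines: a tiny OpenAPI description plus an HTML page that loads the standalone ReDoc bundle and points it at the spec. The API described here is hypothetical, and the bundle URL and the redoc element with its spec-url attribute are written from memory of the ReDoc README, so treat them as assumptions and verify them against the current documentation.

```python
import json

# A deliberately tiny OpenAPI 3 description of a hypothetical API.
spec = {
    "openapi": "3.0.3",
    "info": {"title": "Example API", "version": "1.0.0"},
    "paths": {
        "/ping": {
            "get": {
                "summary": "Health check",
                "responses": {"200": {"description": "The service is up"}},
            }
        }
    },
}

# Assumed location of the standalone ReDoc bundle; verify it against the ReDoc docs.
REDOC_BUNDLE = "https://cdn.redoc.ly/redoc/latest/bundles/redoc.standalone.js"


def redoc_page(spec_url, title):
    """Return an HTML page that asks ReDoc to render the spec at spec_url."""
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title></head>\n"
        f'<body><redoc spec-url="{spec_url}"></redoc>\n'
        f'<script src="{REDOC_BUNDLE}"></script></body></html>\n'
    )


spec_json = json.dumps(spec, indent=2)
page = redoc_page("openapi.json", spec["info"]["title"])
print(page)
```

Saving spec_json as openapi.json and page as index.html in the same directory, and serving that directory with any static web server, should be enough to view the rendered documentation in a browser.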

Conclusion

In the realm of open source software development, user-friendly API documentation is a non-negotiable aspect of project success. ReDoc empowers open source developers to create captivating and interactive documentation effortlessly. With ReDoc, your API documentation can be the key that invites developers into your project and fosters collaboration.

std::erase, erase_if C++20

In this blog post, we’ll explore how std::erase functions simplify container manipulation and improve code readability.

Dealing with the removal of specific elements from a container in your C++ code can often be a cumbersome and error-prone task. Fortunately, C++20 introduces two powerful allies to streamline this process: std::erase and std::erase_if. These functions bring efficiency and clarity to element removal, and in this article, we’ll explore their workings and benefits.

Why We Need Easy Element Removal

When you’re working with C++ containers like vectors or lists, you often want to kick out some elements based on certain conditions or values. Before C++20, this was like navigating a maze blindfolded. You had to use the erase-remove idiom or write loops and custom code to find and remove the elements you wanted. Not exactly a picnic, right?

Meet std::erase: The Element Eraser

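std::erase is like the Marie Kondo of C++20. It helps you tidy up your container by removing all instances of a specific value. It’s super easy to use and works with sequence containers such as vectors, deques, lists, and strings. Sets and maps are not covered by std::erase, because they already have a member erase() that removes by key; for them, C++20 adds only std::erase_if. Here’s how it works: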

std::vector<int> numbers = {1, 2, 3, 2, 4, 2, 5};
std::erase(numbers, 2);  // Say goodbye to all those 2s

In this example, we’re saying, “Hey, std::erase, please get rid of all the 2s.” And just like magic, the vector becomes {1, 3, 4, 5}. Neat, right?

Meet std::erase_if: The Selective Element Picker

Now, what if you want to get a bit pickier and remove elements based on custom conditions? That’s where std::erase_if comes in. It’s like having a personal assistant that follows your criteria. Check it out:

std::vector<int> numbers = {1, 2, 3, 4, 5, 6};
std::erase_if(numbers, [](int n) { return n % 2 == 0; });  // Adios, even numbers!

In this case, we’re using a cool lambda function as a “picker.” It says, “Bye-bye, even numbers,” and, voilà, we’re left with {1, 3, 5}. std::erase_if lets you customize the removal process based on your whims and fancies.

The Perks of std::erase and std::erase_if

These new additions in C++20 bring some serious perks:

  1. Readability: Your code becomes a breeze to read because these functions spell out your intent—removing elements—right in their names.
  2. Simplicity: One function call does the trick, no more convoluted loops or DIY removal code.
  3. Safety: Standard library functions are your trusty sidekicks, reducing the risk of bugs and odd edge cases in your element removal logic.
  4. Performance: For vectors, these functions remove all matches in a single linear pass, just like the erase-remove idiom, so your code stays zippy even when you’re juggling large containers.

In Conclusion

Say goodbye to the headaches of element removal and embrace the simplicity of std::erase and std::erase_if. They make your code cleaner, more readable, and safer. Whether you’re cleaning up by value or on your custom criteria, C++20 has your back. So go ahead, give these new features a whirl in your C++ projects, and let your code shine. Happy coding! 🚀👨‍💻

Introspect mapbox::value

When you’re working with container types that can hold arrays, objects or values, it might be very beneficial if you could look inside to discover its structure and the value it holds. Using the API might be cumbersome, but luckily we can use the toJson() method to dump its content to the console, like so:

value.toJson()

mapbox::value is a type from the Mapbox GL Native library that can hold a variety of different data types (e.g. integers, strings, arrays, etc.). To introspect a mapbox::value object, you can use the following methods:

  1. type() method: This method returns an enumeration value indicating the type of data stored in the mapbox::value object. For example, if the mapbox::value object holds an integer, the type() method will return mapbox::value_type::number_integer.
  2. get_value_type() method: This method returns a string representation of the type stored in the mapbox::value object. For example, if the mapbox::value object holds a string, the get_value_type() method will return “string”.
  3. get<T>() method: This method allows you to access the underlying value stored in the mapbox::value object. You need to specify the data type T that you expect to retrieve. If the type stored in the mapbox::value object is different from T, a mapbox::util::bad_cast exception will be thrown.
  4. is<T>() method: This method allows you to check if the mapbox::value object holds a value of type T. It returns true if the underlying value is of type T, and false otherwise.

You can use these methods to introspect a mapbox::value object and determine its type and underlying value. Once you have determined the type and value of the mapbox::value object, you can use the appropriate methods to access and manipulate the value as needed.

Let’s dive a bit deeper into how you can leverage these methods.

Using type() to Determine Data Type

Consider a scenario where you have a mapbox::value object, but you’re not sure what type of data it holds. The type() method comes to the rescue. Here’s how you can use it:

mapbox::value myValue = ...; // Your mapbox::value object
switch (myValue.type()) {
    case mapbox::value_type::number_integer:
        // Handle integer data
        break;
    case mapbox::value_type::string:
        // Handle string data
        break;
    case mapbox::value_type::array:
        // Handle array data
        break;
    // Add more cases for other data types as needed
}

By examining the result of type(), you can take appropriate actions based on the actual data type.

Getting the String Representation with get_value_type()

Sometimes, you might not need the specific enumeration value returned by type(). Instead, you might prefer a more human-readable representation of the data type. This is where get_value_type() shines:

mapbox::value myValue = ...; // Your mapbox::value object

std::string valueType = myValue.get_value_type();

// Now, valueType contains a string representation of the data type.

This string can be helpful for logging, reporting, or simply for making your code more understandable.

Accessing the Underlying Value with get<T>()

To access the actual value contained within a mapbox::value object, you can use the get<T>() method. Specify the data type T that corresponds to the expected data type. If the mapbox::value object doesn’t hold a value of that type, an exception will be thrown. Here’s an example:

mapbox::value myValue = ...;  // Your mapbox::value object

try {
    int intValue = myValue.get<int>();  // Attempt to get an integer value
    // Handle intValue
} catch (const mapbox::util::bad_cast&) {
    // Handle the case where myValue doesn't contain an integer
}

This approach ensures type safety and helps prevent runtime errors caused by incompatible data types.

Checking the Type with is<T>()

Before attempting to access the value with get<T>(), you might want to check if the mapbox::value object actually holds a value of a specific type. This can be done using the is<T>() method:

mapbox::value myValue = ...;  // Your mapbox::value object

if (myValue.is<int>()) {
    // myValue contains an integer
    int intValue = myValue.get<int>();
} else {
    // Handle the case where myValue is not an integer
}

This approach allows you to safely access the value only if it matches the expected type.

Introspecting a mapbox::value object doesn’t have to be a daunting task. By using the type(), get_value_type(), get<T>(), and is<T>() methods, you can confidently explore the contents of these versatile objects. Whether you’re parsing JSON data, working with Mapbox GL Native, or dealing with any other scenario involving mapbox::value, these introspection techniques will be your reliable companions in understanding and handling your data. Happy coding!