Greetings Neuroscience Community!
I have been invited by the editor of Brain Immune (https://brainimmune.com) to write an article about the use of Artificial Intelligence in scientific research. After consideration, I decided that the most appropriate format would be a series of shorter articles, each one focusing on a different aspect of this expansive topic.
This first article serves as an introduction to several important concepts in the fields of Artificial Intelligence (AI) and Machine Learning (ML).
What is AI?
In simple terms, AI is a specialized field of study within the broader field of Computer Science that studies and advances the ability of Universal Turing Machines (named after the computing pioneer Alan Turing, and commonly referred to simply as "computers") to mimic human thinking. The field has enjoyed, and continues to enjoy, explosive growth, with up to fifty peer-reviewed academic papers being published every week.
Whenever you read a popular news article about AI, you are likely to encounter the following terms:
Generative AI: A specialized type of AI tool that can generate text as well as images, audio, and even video clips.
Large Language Model (LLM): A type of Generative AI tool that can interact with humans in a human-like fashion. Such models are powerful in that they mimic human understanding of language and can generate remarkably human-like responses.
Reasoning LLM: A type of LLM that does not respond to user queries immediately; instead, it applies Test-Time Compute and first "thinks" in human language before responding. Such LLMs are significantly more intelligent and capable than plain LLMs.
Prompt: The commonly accepted term for a human-to-LLM interaction is a "prompt". Whenever you ask an LLM a question or give it an instruction, you "issue a prompt". This has given rise to a new profession called "prompt engineering"; I myself offer professional services as a prompt engineer.
Token: A unit of output from a text-generating LLM that can also be thought of as a unit of meaning. Simple words like "I", "you", and "yes" each correspond to a single token. In everyday spoken English, a sentence of five words will contain on average between six and ten tokens; scientific text has a higher token count per sentence, and a sentence written in Shakespearean English higher still.
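For a rough, hands-on illustration, the sketch below uses the open-source tiktoken library (one of several available tokenizers; exact counts differ between tokenizers, and the sample sentences are invented for the example):

```python
# Illustration only: token counts vary between tokenizers.
# Requires the open-source "tiktoken" package (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # an encoding used by several OpenAI models

for sentence in ["I like you.",
                 "The hippocampus modulates neuroimmune signalling.",
                 "Wherefore art thou so melancholic, good sir?"]:
    tokens = enc.encode(sentence)
    print(f"{len(sentence.split()):2d} words -> {len(tokens):2d} tokens: {sentence}")
```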
Tokens Per Second (TPS): The performance of an LLM is measured in tokens per second, the rate at which the LLM generates output tokens. It is a function of the size of the LLM, its internal design, the underlying computer hardware the LLM runs on, and the overall load on that hardware.
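The measurement itself is straightforward, as this minimal sketch shows; `generate` here is a hypothetical placeholder for whatever inference call your LLM library provides, not a real API:

```python
import time

def measure_tps(generate, prompt: str) -> float:
    """Measure tokens per second for a text-generating function.

    `generate` is a hypothetical stand-in for an LLM library's
    inference call; it is assumed to return the output tokens.
    """
    start = time.perf_counter()
    output_tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return len(output_tokens) / elapsed

# Example with a dummy generator that "produces" 42 tokens instantly:
fake_generate = lambda prompt: ["tok"] * 42
print(f"{measure_tps(fake_generate, 'Hello'):.0f} tokens/second")
```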
Training: Training is the process that enables LLMs and other types of Artificial Intelligence tools (various types of Neural Networks, Support Vector Machines, Relevance Vector Machines, and others) to perform a certain function. It is a very compute- and data-intensive process that takes several months even when running on tens of thousands of instances of powerful computer hardware such as Graphics Processing Units (GPUs). Specifically for LLMs, training is subdivided into two stages, Pre-Training and Post-Training, described below.
Inference: The processing of data by an AI tool that leads to some meaningful output from the tool. In the case of LLMs, inference happens when the LLM responds to a user prompt. Since inference is most of the time driven by interaction with humans (the exception being agentic swarms), it tends to be significantly less compute-intensive than training, and therefore cheaper.
Pre-Training: This type of training is applicable to LLMs. It is the process in which the LLM is fed a large amount of text (split into tokens by a tokenizer) token by token and predicts what the next token in the text will be. Through this process the LLM learns to understand and output human languages while abiding by rules of syntax and grammar. Pre-training dominates the computational cost of overall training and requires massive input datasets measured in trillions of tokens.
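To make "predict the next token" concrete, here is a toy sketch in plain Python showing how a token sequence becomes a series of prediction exercises; real pre-training operates on integer token IDs and trains a neural network to assign high probability to each target, but the shape of the task is the same:

```python
# Toy illustration of next-token prediction targets.
tokens = ["The", " brain", " has", " billions", " of", " neurons"]

for i in range(len(tokens) - 1):
    context = tokens[: i + 1]   # everything seen so far
    target = tokens[i + 1]      # the token the model must predict
    print(f"context={context!r} -> predict {target!r}")
```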
Post-Training: Post-training covers a variety of techniques for increasing the intelligence level of the LLM after it has learned to understand and output human language. New techniques are being discovered and existing ones improved upon continuously; this is a very active area of ongoing research.
Training Dataset: The dataset used during the training of various AI tools, including LLMs.
Validation Dataset: The dataset used during and after training to evaluate the correctness of the results generated by the AI tool.
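A common convention, sketched below, is to shuffle the data and hold out a fraction of it for validation; the 10% split ratio is an illustrative choice, not a fixed rule:

```python
import random

def train_val_split(examples, val_fraction=0.1, seed=42):
    """Shuffle a dataset and split it into training and validation subsets."""
    rng = random.Random(seed)
    shuffled = examples[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]   # (train, validation)

train, val = train_val_split(list(range(100)))
print(len(train), "training examples,", len(val), "validation examples")
```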
Layer: An abstract architectural unit within Neural Networks and LLMs. A layer contains multiple scalar parameters that are kept in memory during training and inference and are modified according to sophisticated rules during training. As a general rule, the more layers an AI tool has, the more intelligent a thinking process it can emulate.
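As a heavily simplified picture, a single fully connected layer is just a weight matrix and a bias vector, and those scalars are its parameters. A sketch with NumPy (the 512 and 256 dimensions are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(0)

# One fully connected layer: 512 inputs -> 256 outputs.
W = rng.standard_normal((512, 256)) * 0.02   # weight matrix (parameters)
b = np.zeros(256)                            # bias vector (parameters)

def layer(x):
    """Forward pass: the values in W and b are what training adjusts."""
    return np.maximum(x @ W + b, 0.0)        # ReLU activation

print("parameters in this layer:", W.size + b.size)  # 512*256 + 256 = 131,328
```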
Parameter Count: The size of neural networks and LLMs is measured in parameters, and the size in bytes is a linear function of the parameter count. Larger LLMs can emulate more intelligent thought processes but also require more computing power during training and inference.
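The byte size follows directly from the parameter count and the numeric precision of each parameter; the 7-billion-parameter figure below is a hypothetical example:

```python
def model_size_gb(n_params: float, bytes_per_param: int) -> float:
    """Size of the raw model weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

n = 7e9  # a hypothetical 7-billion-parameter LLM
print(model_size_gb(n, 2))  # 16-bit precision: 14.0 GB
print(model_size_gb(n, 1))  # 8-bit quantized:   7.0 GB
```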
This glossary is not exhaustive, but should serve as an adequate point of reference that will help you understand news articles about AI.
Thank you for reading! If you enjoyed this article or if you would like me to cover a specific topic or recent development in the field of AI in greater depth, contact the editor of Brain Immune at editor@brainimmune.com.
See you in my next article!
Zlatin Balevsky
March 5th 2025