July 14, 2025 · Ryan X. Charles
tl;dr
I trained a neural network in Nushell by wrapping libtorch, the same C++ library that powers PyTorch, in a Nushell plugin, making it possible to create and process tensors on the command line.
Furthermore, I wrapped an entire web app and web browser in a second Nushell plugin, making it possible to render beautiful, high-resolution plots in the terminal.
Together, these tools let me train a neural network to classify data and visualize the results, all inside my terminal.
I also compared the results to an equivalent implementation in PyTorch and Matplotlib, and they match exactly.
Results Preview
The following plot illustrates the neural network that I created with Nushell. I started by generating data points that fall into three clusters. I then used a two-layer neural network to classify the points. The final plot overlays the original “true” data points on a contour map of the model’s predictions, showing how well the predictions match the original data.

What is Nushell?
Nushell is a new shell that is designed to handle structured data, unlike traditional shells like zsh and bash, which are focused on text. It is written in Rust and has many features of modern programming languages, including a powerful type system, good errors, and a plugin system.
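For example, ls in Nushell returns a table of structured records rather than plain text, so you can filter and sort it without reaching for awk or grep:

ls | where size > 1mb | sort-by modified | select name size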
Why Nushell?
Although Python is a good programming language, I always thought it would be cool if I could pipe data into my GPU directly from the command line. Python is not a shell, so this is impractical in Python. Nushell, however, is a shell designed to handle structured data, so it can easily pipe data from one command to another.
Although building a similar proof of concept with zsh or bash is possible, no one would want to do data analysis in those shells, because they treat all data as text: every step would have to serialize data out and parse it back in. Nushell, however, handles structured data efficiently, so it is fast enough to pipe complex data from command to command. Nushell's tooling may not yet be as mature as Python's, but in principle it is just as capable of data analysis, and it brings a whole new dimension of value to the table by being your shell, and thus working out of the box with all of your existing shell commands and workflows.
How Does it Work?
I created two tools to make this possible: Nutorch and Termplot.
Nutorch
Nutorch is a Nushell plugin written in Rust that wraps tch-rs, which itself is a Rust wrapper for libtorch, the C++ library that powers PyTorch. This plugin allows you to create and manipulate tensors in Nushell, which are the fundamental data structure used in neural networks (and a lot of other data analysis). The API is designed to be similar to PyTorch, so if you are familiar with PyTorch, you will feel right at home. It also has some convenient Nushell-specific features, such as the ability to pipe data directly into tensors from other commands.
The fundamental idea of Nutorch is very simple: you can create tensors, either by piping data in or generating them from scratch, on any device (CPU, CUDA, or MPS). Those tensors are stored in a hash table that lives as long as the plugin process does. Nushell decides when the plugin exits; by default, the plugin shuts down 10 seconds after its last command.
Commands then accept tensors as input or arguments, and the tensors are retrieved from memory and operated on. To view the final result, you can run torch value, which downloads the tensor and converts it into a Nushell value that can then be printed or piped to another command.
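For example, a minimal round trip looks like this (the command names follow the appendix below; torch tensor returns a string id for the stored tensor, which you can keep in a variable and pipe onward):

let t = ([1.0 2.0 3.0] | torch tensor)
$t | torch exp | torch value  # => approximately [2.7183, 7.3891, 20.0855]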
Termplot
Termplot is a Node.js CLI tool that runs an entire headless web browser (via puppeteer) alongside a web server hosting a React Router web app with an in-memory database of JSON configuration files, which it uses to render plotly.js plots (or, in the future, any other web-based plotting library). Termplot takes a screenshot of the plot and renders it in the terminal using ANSI escape codes.
The standard Termplot CLI tool must open an entire web browser for every plot, which is slow. So I also made a Nushell plugin for Termplot that keeps the web browser running in the background, making plots render almost instantly: the normal CLI tool takes about 2 seconds to render a plot, while the Nushell plugin renders one in about 20 milliseconds.
beautiful.nu
I have also created a simple Nushell script called beautiful.nu that generates catppuccin-themed plotly.js configuration files, thus bridging Nutorch and Termplot. The beautiful.nu code is published on NPM and on GitHub in the Nutorch repo.
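To give a sense of how the pieces fit together, a list of records with x and y data can be piped through beautiful lines straight into termplot (this mirrors the one-liner in the next section):

[{x: [1 2 3 4] y: [1 4 9 16]}] | beautiful lines | termplot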
Simple One-Liner Example
As a simple example of the power of Nutorch and Termplot, consider the following one-liner:
torch arange 0 10 0.1 --device mps | torch exp | torch value | [{y: $in}] | beautiful lines | termplot
This command generates a tensor of values from 0 to 10 in increments of 0.1, computes the exponential of each value, downloads the result from the GPU, generates a “beautiful” plot configuration from the data, and then plots the result using Termplot. The --device mps flag specifies that the computation should be done on the Apple Silicon GPU via Metal Performance Shaders (if available). The final result is a beautiful plot of the exponential function rendered directly in the terminal.

Have you ever seen a one-liner like this, one that does a computation on the GPU and renders the result directly in the terminal? This is the power of Nutorch and Termplot working together.
Neural Network Example
I trained a neural network in Nushell using Nutorch and Termplot in the following manner:
- I generated a series of data points that fall into three separate clusters. Each data point corresponds to a label, which is the cluster it belongs to. The labels are integers 0, 1, or 2.
- Knowing that a two-layer neural network with ReLU activation could classify this data, I created a Nushell function that initializes the model parameters (weights and biases).
- I created a Nushell function that performs a forward pass through the model, which computes the logits (raw outputs) of the model.
- I created a Nushell function that computes the cross-entropy loss between the logits and the target labels (the formula is given after this list).
- I created a Nutorch function, called from Nushell, that performs stochastic gradient descent (SGD) to update the model parameters based on the loss.
- I created a Nushell function that trains the model for a specified number of epochs, recording the loss every so often.
- I created a Nushell function that plots the raw data points using Termplot.
- I created a Nushell function that plots the loss over time using Termplot.
- I created a Nushell function that plots the model predictions using Termplot, which shows the decision boundary of the model and how well it matches the original data.
- I programmed the exact same logic in Python using PyTorch and Matplotlib to verify the results. Because the underlying library is libtorch, using the same random seed produces exactly the same results, proving that the Nushell code works exactly as intended. The Python code was written first and used as a template to make sure the Nushell code was complete.
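For reference, here are the two pieces of math at the heart of the training loop: the cross-entropy loss (computed in the code as log-softmax, gather, mean, negate) and the vanilla SGD update:

$$\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \log \frac{e^{z_{i,y_i}}}{\sum_{j} e^{z_{i,j}}} \qquad\qquad p \leftarrow p - \eta \,\frac{\partial \mathcal{L}}{\partial p}$$

where $z_{i,j}$ is the logit for sample $i$ and class $j$, $y_i$ is the target label, $N$ is the number of samples, and $\eta$ is the learning rate (0.1 here).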
You can see the code for the Nushell neural network example and the corresponding Python code in the appendices below. You can also find installation instructions for Nutorch and Termplot on their respective GitHub pages.
Visualizing the Results
The following plot shows the initial data points, colored by their class.

The following plot shows the loss over time during training.

The following plot shows the model predictions, with the decision boundary overlaid on the original data points.

These plots were created using Nutorch and Termplot, but similar plots can be produced using the Python code in the appendix below.
Conclusion
Modern technologies including Nushell, libtorch, and tch-rs make it possible to do data analysis and machine learning in a shell environment. This is extremely convenient and powerful for anyone who spends a lot of time in the terminal because it is compatible with all of your existing shell commands and workflows.
A proof-of-concept neural network was created in Nushell using Nutorch and Termplot, demonstrating the power of these tools. The neural network was trained to classify data points into three clusters, and the results were visualized using Termplot. The results exactly match an equivalent implementation in Python using PyTorch and Matplotlib, proving that the Nushell implementation works as intended.
To learn more about how to use Nutorch and Termplot, please visit their respective websites, which currently redirect to GitHub.
Appendix A: Nushell Neural Network Example
plugin use torch
plugin use termplot
source node_modules/termplot.nu/termplot.nu
use node_modules/beautiful.nu *
torch manual_seed 42
def generate_data [
  --n_samples: int = 300 # Number of samples to generate
  --centers: int = 3 # Number of cluster centers
  --cluster_std: float = 0.7 # Standard deviation of clusters
  --skew_factor: float = 0.3 # Skew factor for data distribution
]: [nothing -> record<X: string, y: string>] {
  let n_samples_per_class: int = ($n_samples // $centers)
  mut X_list: list<string> = [] # nutorch tensors have string ids
  mut y_list: list<string> = [] # nutorch tensors have string ids
  let blob_centers: list<string> = [
    (torch tensor [0.0 0.0])
    (torch tensor [3.0 0.0])
    (torch tensor [1.5 2.5])
  ]
  for i in (seq 0 ($centers - 1)) {
    mut points: string = (torch randn $n_samples_per_class 2) | torch mul (torch tensor $cluster_std) | torch add ($blob_centers | get $i)
    if $i == 1 or $i == 2 {
      let center = ($blob_centers | get $i)
      let skew = torch tensor [[1.0 ($skew_factor * ($i - 1))] [($skew_factor * ($i - 1)) 1.0]]
      $points = $points | torch sub $center | torch mm $skew | torch add $center
    }
    let labels: string = torch full [$n_samples_per_class] $i --dtype 'int64'
    $X_list = $X_list | append $points
    $y_list = $y_list | append $labels
  }
  let X: string = $X_list | torch cat --dim 0
  let y: string = $y_list | torch cat --dim 0
  {X: $X y: $y}
}
def model_init [
  --input_size: int = 2 # Number of input features
  --hidden_size: int = 20 # Number of hidden units
  --output_size: int = 3 # Number of output classes
]: [nothing -> record<w1: string, b1: string, w2: string, b2: string>] {
  {
    w1: (torch randn $hidden_size $input_size --requires_grad true)
    b1: (torch randn $hidden_size --requires_grad true)
    w2: (torch randn $output_size $hidden_size --requires_grad true)
    b2: (torch randn $output_size --requires_grad true)
  }
}
def model_get_parameters [
  --model: record<w1: string, b1: string, w2: string, b2: string>
]: [nothing -> list<string>] {
  [$model.w1 $model.b1 $model.w2 $model.b2]
}
def model_forward_pass [
  --model: record<w1: string, b1: string, w2: string, b2: string>
]: [string -> string] {
  torch mm ($model.w1 | torch t) # Matrix multiplication with input and first layer weights
  | torch add $model.b1 # Add bias for first layer
  | torch maximum ([0.0] | torch tensor) # ReLU activation
  | torch mm ($model.w2 | torch t) # Matrix multiplication with second layer weights
  | torch add $model.b2 # Add bias for second layer
}
def cross_entropy_loss [
  --logits: string # tensor id of model outputs
  --targets: string # tensor id of target labels
]: [nothing -> string] {
  let logp = $logits | torch log_softmax --dim 1
  # print $"logp: ($logp | torch mean | torch value)"
  let loss = $logp | torch gather 1 ($targets | torch unsqueeze 1) | torch squeeze 1 | torch mean | torch neg
  $loss
}
def train [
  --model: record<w1: string, b1: string, w2: string, b2: string>
  --X: string # Input tensor id
  --y: string # Target tensor id
  --epochs: int = 1000
  --lr: float = 0.1
  --record_every: int = 100
]: [nothing -> record<model: record<w1: string, b1: string, w2: string, b2: string>, losses: list<number>, steps: list<number>>] {
  mut losses: list<number> = []
  mut steps: list<number> = []
  let ps = model_get_parameters --model $model
  for epoch in (seq 0 ($epochs - 1)) {
    let logits = $X | model_forward_pass --model $model
    let loss = cross_entropy_loss --logits $logits --targets $y
    for p in $ps {
      $p | torch zero_grad
    }
    $loss | torch backward
    torch sgd_step $ps --lr $lr
    if ($epoch + 1) mod $record_every == 0 {
      $losses = $losses | append ($loss | torch value)
      $steps = $steps | append ($epoch + 1)
      print $"epoch: ($epoch + 1)/($epochs), loss: (($loss | torch value | math round --precision 4))"
    }
  }
  return {
    model: $model
    losses: $losses
    steps: $steps
  }
}
def plot_raw_data [res: record<X: string, y: string>] {
  let X: string = $res.X
  let y: string = $res.y
  let X_value = $X | torch value
  let y_value = $y | torch value
  [
    {
      x: ($X_value | enumerate | each {|xy| if ($y_value | get $xy.index) == 0 { $xy.item.0 } })
      y: ($X_value | enumerate | each {|xy| if ($y_value | get $xy.index) == 0 { $xy.item.1 } })
    }
    {
      x: ($X_value | enumerate | each {|xy| if ($y_value | get $xy.index) == 1 { $xy.item.0 } })
      y: ($X_value | enumerate | each {|xy| if ($y_value | get $xy.index) == 1 { $xy.item.1 } })
    }
    {
      x: ($X_value | enumerate | each {|xy| if ($y_value | get $xy.index) == 2 { $xy.item.0 } })
      y: ($X_value | enumerate | each {|xy| if ($y_value | get $xy.index) == 2 { $xy.item.1 } })
    }
  ] | beautiful scatter | merge deep {layout: {title: {text: "Raw Data"}}} | termplot
}
def plot_loss [
  --losses: list<number> # list of loss values
  --steps: list<number> # list of steps (epochs) corresponding to losses
] {
  [{x: $steps y: $losses}] | beautiful lines | merge deep {layout: {title: {text: "Loss"}}} | termplot
}
def plot_results [
  --X: string # Input tensor id
  --y: string # Target tensor id
  --model: record<w1: string, b1: string, w2: string, b2: string> # Model parameters
]: [nothing -> nothing] {
  let Xl = $X | torch detach | torch value
  let yl = $y | torch detach | torch value
  # let yscaledl = $y | torch div ($y | torch max) | torch detach | torch value
  let x_min = ($Xl | each {|x| $x | get 0 }) | math min
  let x_max = ($Xl | each {|x| $x | get 0 }) | math max
  let y_min = ($Xl | each {|x| $x | get 1 }) | math min
  let y_max = ($Xl | each {|x| $x | get 1 }) | math max
  let xs = torch arange $x_min $x_max 0.1
  # let ys = torch arange $y_min $y_max 0.1
  let ys = $xs
  let mesh = torch stack [
    ($xs | torch repeat ($ys | torch value | length))
    ($ys | torch repeat_interleave ($xs | torch value | length))
  ] --dim 1
  let logits = $mesh | model_forward_pass --model $model
  let Z = torch argmax $logits --dim 1 | torch reshape [($xs | torch value | length) ($ys | torch value | length)]
  beautiful plot
  | beautiful add contour {
    x: ($xs | torch value)
    y: ($ys | torch value)
    z: ($Z | torch value)
    colorscale: (beautiful colorscale 3)
    opacity: 0.4
  }
  | beautiful add scatter {
    x: ($Xl | enumerate | each {|xy| if (($yl | get $xy.index) == 0) { $xy.item.0 } })
    y: ($Xl | enumerate | each {|xy| if (($yl | get $xy.index) == 0) { $xy.item.1 } })
  }
  | beautiful add scatter {
    x: ($Xl | enumerate | each {|xy| if (($yl | get $xy.index) == 1) { $xy.item.0 } })
    y: ($Xl | enumerate | each {|xy| if (($yl | get $xy.index) == 1) { $xy.item.1 } })
  }
  | beautiful add scatter {
    x: ($Xl | enumerate | each {|xy| if (($yl | get $xy.index) == 2) { $xy.item.0 } })
    y: ($Xl | enumerate | each {|xy| if (($yl | get $xy.index) == 2) { $xy.item.1 } })
  }
  | merge deep {layout: {title: {text: "Model Predictions"}}} | termplot
}
let raw_data = generate_data --n_samples 300 --centers 3 --cluster_std 0.7 --skew_factor 0.3
plot_raw_data $raw_data
let net = model_init --input_size 2 --hidden_size 20 --output_size 3
let model_res = train --model $net --X $raw_data.X --y $raw_data.y --epochs 3000 --lr 0.1 --record_every 100
plot_loss --losses $model_res.losses --steps $model_res.steps
plot_results --X $raw_data.X --y $raw_data.y --model $model_res.model
Appendix B: Python Neural Network Example
import torch
from typing import List, Tuple, Dict
import matplotlib.pyplot as plt
torch.manual_seed(42) # reproducibility
def generate_data(
    n_samples: int = 300,
    centers: int = 3,
    cluster_std: float = 0.7,
    skew_factor: float = 0.3,
) -> Tuple[torch.Tensor, torch.Tensor]:
    n_per_class = n_samples // centers
    X_parts, y_parts = [], []
    blob_centers = [
        torch.tensor([0.0, 0.0]),
        torch.tensor([3.0, 0.0]),
        torch.tensor([1.5, 2.5]),
    ]
    for i in range(centers):
        pts = torch.randn(n_per_class, 2) * cluster_std + blob_centers[i]
        if i in (1, 2):
            skew = torch.tensor(
                [[1.0, skew_factor * (i - 1)], [skew_factor * (i - 1), 1.0]]
            )
            pts = torch.mm(pts - blob_centers[i], skew) + blob_centers[i]
        X_parts.append(pts)
        y_parts.append(torch.full((n_per_class,), i, dtype=torch.long))
    return torch.cat(X_parts), torch.cat(y_parts)
Model = Dict[str, torch.Tensor]
def model_init(inp: int = 2, hid: int = 20, out: int = 3) -> Model:
    return {
        "w1": torch.randn(hid, inp, requires_grad=True),
        "b1": torch.randn(hid, requires_grad=True),
        "w2": torch.randn(out, hid, requires_grad=True),
        "b2": torch.randn(out, requires_grad=True),
    }
def model_get_parameters(model: Model) -> List[torch.Tensor]:
    return [model["w1"], model["b1"], model["w2"], model["b2"]]
def model_forward_pass(model: Model, x: torch.Tensor) -> torch.Tensor:
    w1t = model["w1"].t()
    x = torch.mm(x, w1t) + model["b1"]
    x = torch.max(torch.tensor(0.0), x)  # ReLU
    w2t = model["w2"].t()
    x = torch.mm(x, w2t) + model["b2"]
    return x
def cross_entropy_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    logp = torch.log_softmax(logits, dim=1)
    # print(f"logp: {logp.mean()}, targets: {targets.shape}")
    chosen = torch.gather(logp, 1, targets.unsqueeze(1)).squeeze(1)
    return -chosen.mean()
def sgd_step(ps: List[torch.Tensor], lr: float = 0.1) -> None:
    """
    Vanilla gradient descent: p ← p - lr * p.grad.
    Operates in-place; returns nothing. (Gradients are zeroed in the training loop.)
    """
    with torch.no_grad():
        for p in ps:
            if p.grad is not None:
                p -= lr * p.grad
def train(
    model: Model,
    X: torch.Tensor,
    y: torch.Tensor,
    epochs: int = 1000,
    lr: float = 0.1,
    record_every: int = 100,
) -> Tuple[List[float], List[int]]:
    losses, steps = [], []
    ps = model_get_parameters(model)
    for epoch in range(epochs):
        # forward & loss
        logits = model_forward_pass(model, X)
        loss = cross_entropy_loss(logits, y)
        # zero existing grads, back-prop, SGD update
        for p in ps:
            if p.grad is not None:
                p.grad.zero_()
        loss.backward()
        sgd_step(ps, lr)
        if (epoch + 1) % record_every == 0:
            losses.append(loss.item())
            steps.append(epoch + 1)
            print(f"epoch {epoch+1:4d}/{epochs} loss {loss.item():.4f}")
    return losses, steps
def plot_raw_data(X: torch.Tensor, y: torch.Tensor) -> None:
    Xl, yl = X.tolist(), y.tolist()
    plt.scatter([p[0] for p in Xl], [p[1] for p in Xl], c=yl, alpha=0.8, cmap="viridis")
    plt.title("Raw data")
    plt.show()
def plot_loss(losses: List[float], steps: List[int]) -> None:
    plt.plot(steps, losses)
    plt.title("Training loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.show()
def plot_results(X: torch.Tensor, y: torch.Tensor, model: Model) -> None:
    Xl = X.detach().tolist()
    yl = y.detach().tolist()
    x_min = min(p[0] for p in Xl) - 1
    x_max = max(p[0] for p in Xl) + 1
    y_min = min(p[1] for p in Xl) - 1
    y_max = max(p[1] for p in Xl) + 1
    xs = torch.arange(x_min, x_max, 0.1)
    ys = torch.arange(y_min, y_max, 0.1)
    mesh = torch.stack([xs.repeat(len(ys)), ys.repeat_interleave(len(xs))], dim=1)
    # note: do not use no_grad here for easier translating to nushell
    logits = model_forward_pass(model, mesh)
    Z = torch.argmax(logits, dim=1).reshape(len(ys), len(xs))
    plt.contourf(xs, ys, Z, alpha=0.4, cmap="viridis")
    plt.scatter([p[0] for p in Xl], [p[1] for p in Xl], c=yl, alpha=0.8, cmap="viridis")
    plt.title("Decision boundary")
    plt.show()
if __name__ == "__main__":
    X, y = generate_data(n_samples=300, centers=3, cluster_std=0.7, skew_factor=0.3)
    plot_raw_data(X, y)
    net = model_init(inp=2, hid=20, out=3)
    losses, steps = train(net, X, y, epochs=3000, lr=0.1, record_every=100)
    plot_loss(losses, steps)
    plot_results(X, y, net)