Gather your Zotero files into one corpus – a use-case with Nouswise (NotebookLM alternative)


Made with ComfyUI and GIMP by the author.

I don’t want Google poking around but I do want NotebookLM – what can I do?

Want in on NotebookLM? Sign up with your Google account. Upload to NotebookLM and Google grabs it. And I don’t want that. Also, I’m looking for “NotebookLM” features without forcing a Google account onto others, as I want to experiment to see if it is a good fit for my students, and to evaluate whether it helps me plan lessons and research. Nor do I want my personal Google account integrated with my work.

So, Nouswise could be an alternative. Other services exist, but it is better to get started with one and see how it works than to go through endless lists of “Top-five NotebookLM alternatives”.

However, I ran into a challenge pretty fast. I wanted to gather all my articles on statistics from my Zotero library into Nouswise. But each file is stored in its own folder, and the manual labour of going through one folder per file and uploading it is not something I want to do. Gathering all PDFs into one corpus would also be nice if I decide to make a homebrew RAG down the line. Hence this tutorial!

FYI: I’m on a Linux PC. Oh, and don’t forget to replace USER in the paths with your own username!

1. Get a Nouswise account

Nothing to it, pretty standard procedures. Just go to nouswise.com and sign up.

Once in, start a new project from the Home icon in the top left corner.

Create your folder to organize all your documents.

2. Gather your Zotero library

Right-click and export the folder you want from Zotero.

Choose BibTeX as the format, tick “Export Files” so the PDFs are included, and export to a new folder. I named mine Statistics_lib and placed it in my HOME directory. The problem is that each PDF is stored like this:

/home/USER/Statistics_lib/files/1548/*.pdf
/home/USER/Statistics_lib/files/5191/*.pdf
etc.

The numbers represent a unique folder per file. I want all my PDFs in one place for a smooth upload to Nouswise.
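Before flattening, it may be worth checking whether two of those numbered folders hold PDFs with the same file name, since such files would compete for one name in a single folder. A small sketch, assuming GNU find (for -printf) and file names without line breaks; duplicate_pdf_names is a hypothetical helper name, not something Zotero or Nouswise provides:

```shell
# Hypothetical helper: list PDF file names that occur more than once
# anywhere below the given directory.
duplicate_pdf_names() {
    # -printf '%f\n' (GNU find) prints only the file name, without its folder.
    # sort groups identical names together; uniq -d keeps the repeated ones.
    find "$1" -type f -name "*.pdf" -printf '%f\n' | sort | uniq -d
}
```

Call it as duplicate_pdf_names /home/USER/Statistics_lib/files/ and an empty result means every PDF name is unique.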

Note: working from and in the correct directories is crucial! Be sure to use the correct directories (“addresses”).

Make a directory where you want to gather all your PDFs. Enter the following bash commands in the terminal:

mkdir /home/USER/Statistics_lib_new

First, as a dry run, check that find locates the files you want to copy:

find /home/USER/Statistics_lib/files/ -name "*.pdf" -print0 | xargs -0 echo

You should get a long list of all the PDFs. Next, and this is a mouthful, copy them to the new directory.

sudo find /home/USER/Statistics_lib/files/ -name "*.pdf" -print0 | xargs -0 cp --update=none -t /home/USER/Statistics_lib_new/
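One caveat with flattening: if two folders contain a PDF with the same file name, both copies want the same name in the target folder, and with --update=none the second one is skipped. A minimal sketch of a variant that avoids this by prefixing each copy with the name of its numbered folder. The function name flatten_pdfs and the ID_filename.pdf naming scheme are my own assumptions for illustration:

```shell
# Hypothetical helper: copy every PDF below SRC into DEST, naming each copy
# <parent-folder>_<file-name> (e.g. 1548_paper.pdf) so that identical names
# from different folders cannot collide.
flatten_pdfs() (
    src="$1"
    dest="$2"
    mkdir -p "$dest"
    # find hands the paths straight to a small inline script, so spaces and
    # special characters in file names are passed through unchanged.
    find "$src" -type f -name "*.pdf" -exec sh -c '
        dest="$1"; shift
        for pdf in "$@"; do
            parent=$(basename "$(dirname "$pdf")")
            target="$dest/${parent}_$(basename "$pdf")"
            # Leave copies from an earlier run untouched.
            [ -e "$target" ] || cp -- "$pdf" "$target"
        done
    ' sh "$dest" {} +
)
```

Usage would be flatten_pdfs /home/USER/Statistics_lib/files /home/USER/Statistics_lib_new, with the target kept outside the source folder as in the tutorial.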

And now for the moment of truth:

ls -l /home/USER/Statistics_lib_new
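A listing shows that files arrived, but not whether all of them did. As a rough check, the number of PDFs in the source tree can be compared with the number in the new folder. This is only a sketch, and the names count_pdfs and compare_pdf_counts are hypothetical:

```shell
# Hypothetical helper: count PDFs below a directory. Counting the NUL
# separators from -print0 keeps the number right even for odd file names.
count_pdfs() {
    find "$1" -type f -name "*.pdf" -print0 | tr -cd '\0' | wc -c
}

# Hypothetical helper: print both counts and succeed only if they match.
compare_pdf_counts() (
    src_count=$(( $(count_pdfs "$1") ))
    dest_count=$(( $(count_pdfs "$2") ))
    echo "source: $src_count, destination: $dest_count"
    [ "$src_count" -eq "$dest_count" ]
)
```

Run it as compare_pdf_counts /home/USER/Statistics_lib/files /home/USER/Statistics_lib_new. A lower destination count would suggest that some files shared a name and were skipped.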

🎉 Success! 🎉

You now have all the files in one place!

3. Upload to Nouswise

From here on it is tinkering within Nouswise. Upload your files (there is a cap of 40 files per upload, so go easy) from `~/Statistics_lib_new` to your project folder on Nouswise.
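With the 40-file cap in mind, it can help to pre-sort the PDFs into subfolders of at most 40 files each, so every subfolder is one upload. A minimal sketch; the function name batch_pdfs and the batch_01, batch_02, ... folder names are assumptions of mine, and note that it moves the files rather than copying them:

```shell
# Hypothetical helper: move the PDFs in DIR into subfolders batch_01,
# batch_02, ... holding at most SIZE files each (default 40).
batch_pdfs() (
    dir="$1"
    size="${2:-40}"
    i=0
    for pdf in "$dir"/*.pdf; do
        # If there are no PDFs the pattern stays unexpanded; skip it.
        [ -e "$pdf" ] || continue
        batch=$(printf '%s/batch_%02d' "$dir" $(( i / size + 1 )))
        mkdir -p "$batch"
        mv -- "$pdf" "$batch/"
        i=$(( i + 1 ))
    done
)
```

For example, batch_pdfs /home/USER/Statistics_lib_new 40 would leave the folder containing only batch subfolders, each ready to be uploaded in one go.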

Happy exploring with your new “NotebookLM”!

…honestly though, I wasn’t that happy with the Nouswise UI, but I’ll give it a try 🤷🏼! And there is the caveat that it runs on OpenAI LLMs, so you are not off the hook when it comes to data privacy. Keep that in mind when you upload your documents. Yeah, I might start cooking on that RAG now…

Bonus material:

Let’s have Mistral-Small3.2 explain the magical sentence in this whole post:

ollama run mistral-small3.2:latest
>>> Explain this bash: sudo find /home/USER/Statistics_lib/files/
… -name "*.pdf" -print0 | xargs -0 cp --update=none -t /home/USER/Statistics_lib_new/

This bash command is used to copy all PDF files from one directory to another. Let me break it down for you:

  1. sudo: This runs find with superuser privileges, which may be necessary if you don’t have permission to read the source directories. It applies only to find, not to the cp on the other side of the pipe, and for files in your own home directory it is usually unnecessary.

  2. find /home/USER/Statistics_lib/files/ -name "*.pdf" -print0: This finds all files with the .pdf extension in the specified directory and prints them with a null character (-print0) as a separator between filenames. The null character is used instead of a newline to handle filenames that might contain spaces or special characters.

  3. |: This is a pipe that takes the output of the find command and passes it as input to the next command.

  4. xargs -0: This takes the null-delimited input from the pipe and constructs arguments for the next command. The -0 option tells xargs to expect null-delimited input.

  5. cp --update=none -t /home/USER/Statistics_lib_new/: This copies the files found by find to the specified destination directory. The --update=none option tells cp not to replace anything: if a file with the same name already exists in the destination, that file is skipped and the existing copy is left untouched. The -t option specifies the target directory.

In summary, this command copies all PDF files from /home/USER/Statistics_lib/files/ to /home/USER/Statistics_lib_new/, skipping any file whose name already exists in the destination.
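To see for yourself what --update=none does when a file of the same name is already in the destination, here is a small throwaway experiment in a temporary directory. It is only a sketch: demo_update_none is a hypothetical name, and it assumes a GNU cp recent enough to accept --update=none (older versions reject the option, in which case the function says so and stops):

```shell
# Hypothetical demo: copy a "new" file onto an existing "old" file of the
# same name using --update=none, then show which content is left.
demo_update_none() (
    tmp=$(mktemp -d) || exit 1
    echo "new" > "$tmp/source.pdf"
    mkdir "$tmp/dest"
    echo "old" > "$tmp/dest/source.pdf"
    if ! cp --update=none -t "$tmp/dest" "$tmp/source.pdf" 2>/dev/null; then
        echo "this cp does not understand --update=none"
        rm -rf "$tmp"
        exit 2
    fi
    # Print what now sits in the destination.
    cat "$tmp/dest/source.pdf"
    rm -rf "$tmp"
)
```

On a cp that supports the option, running demo_update_none prints old: the existing file in the destination is kept and the copy is skipped rather than overwritten.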