Ask HN: How do you learn new libraries without much documentation?
At work, I have been asked to build a couple of POCs on a new Angular-based framework our company purchased.
This is proprietary code for a niche industry so the community isn't as large. I also don't have access to any experts on this software.
- A common issue I face is when I want to import a module (and know that the functionality exists) but don't know what to call or where to call it from
eg: import { XYZ } from '<WHERE>'
- I have tried asking questions on their private community but it's pretty dead and no one ever responds
There are some tutorial courses but they can only take you so far. How do I get better at this framework or at least good enough to build some basic POCs?

If the code has tests, I would start by looking at those tests. If it has no tests, then I would slowly try to build tests to document the functionality that I need. In your case, being Angular, that might mean having simple HTML pages with the smallest module that you need. How to find things? If you're on Windows try AstroGrep http://astrogrep.sourceforge.net/ to quickly search and jump around in the code; on any system I use VS Code for similar functionality. Also learn to use command line find/grep. The book "Working Effectively with Legacy Code" also helped me be more comfortable navigating and changing large code bases; taking a long-term view, I recommend this book to every developer https://www.amazon.co.uk/Working-Effectively-Legacy-Michael-... Lastly, I would raise this, because the company might not be aware they are buying a low quality framework that maybe ticks all the boxes in the contract but is in effect impossible to use by their current developers (you). It might be that there are other people with more experience in said niche who might be able to help. In the private community maybe some people would be able to accept a short contract to help train you.

Xadoc's advice above is good; unit tests. I work with poorly documented protocols that have been implemented "around the theme of the protocol" by hardware from a variety of suppliers, and this is how we work out its quirks. A battery of unit tests, starting with the simplest functions it offers, and thence upwards into more complicated tests (i.e. chained calls of the presented functions) where we track what internal state we think the system should have at that point in the tests and interrogate it to discover what internal state it really does have.

These are exploratory tests rather than unit tests, but your point stands.
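The exploratory-test idea described above (start with the simplest call, then chain calls while tracking the state you expect and comparing it with what the library reports) might look roughly like the sketch below. `VendorCounter` and its methods are entirely hypothetical stand-ins for whatever undocumented component is being studied; in practice it would be imported from the vendor library rather than defined inline.

```python
import unittest


class VendorCounter:
    """Hypothetical stand-in for an undocumented vendor component."""

    def __init__(self):
        self._value = 0
        self._open = False

    def open(self):
        self._open = True

    def add(self, n):
        # The kind of quirk exploratory tests pin down:
        # calls made before open() are silently ignored.
        if self._open:
            self._value += n
        return self._value

    def state(self):
        return {"open": self._open, "value": self._value}


class ExploratoryTests(unittest.TestCase):
    def test_fresh_object_state(self):
        # Simplest case first: what does an untouched object report?
        self.assertEqual(VendorCounter().state(), {"open": False, "value": 0})

    def test_add_before_open_is_ignored(self):
        # Records observed behaviour; a test like this usually starts as a
        # guess and is corrected to match what the library actually does.
        self.assertEqual(VendorCounter().add(5), 0)

    def test_chained_calls_match_expected_state(self):
        # Keep our own model of the state and compare after every call.
        counter = VendorCounter()
        counter.open()
        expected = 0
        for n in (1, 2, 3):
            expected += n
            counter.add(n)
            self.assertEqual(counter.state()["value"], expected)


def run_exploration():
    """Run the exploratory tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExploratoryTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_exploration()
```

Each passing test then doubles as a small piece of documentation for the next person who has to learn the same component.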
xadoc's last point is really important. The missing documentation is clearly impacting your productivity: you absolutely should raise this as an issue with more senior devs or management. There are a number of ways to respond, and they should be pleased that you have flagged the issue early.

This is good advice. I would also extend this and write out an FAQ / stackexchange for the next engineer at your company who has to go through the same learning curve.

Oh boy. I still have nightmares about this. At $former_workplace we had an entire SDK, a few hundred thousand LoCs in total, with basically no documentation whatsoever. The team that wrote most of it had long been laid off. We traded notes on how to do various things but as soon as you went out of whatever module you'd been typically working on, all bets were off. I don't know much about Angular but I think most of these things are pretty much universal:
> A common issue I face is when I want to import a module( and know that the functionality exists) but don't know what do I call and where can I call it from
Generate call graphs from the source code. It's generally a good bet that the functions at (or near) the top of the call graphs are the ones that you're supposed to call. If the library has automated tests, have a look at those -- they won't give you much information about idiomatic usage, but will at least tell you what parts of the whole thing you're supposed to interface with. Liberally grep through the source code for whatever functionality you're looking for. In the absence of documentation, you'll have to create your own "mental map" of what things there are, and where. Other than that, all I can do is recommend everyone else's generic advice: read the source code, take lots of notes.

If there isn't good documentation, then you have to learn the library by studying the source code. I will sometimes create my own notes on libraries as I go through module by module.
This is time-consuming to be sure, especially up front. You'll pay a high price today for better control and speed with using the library down the road. As you iterate through each module in the source code, ask yourself what each function or class does, what its purpose is, whether there are any side effects or what sort of state changes occur when a method is called (if any).

Look at the library's tests. It's quicker and better for learning functionality than "just read the source code". If it's proprietary and closed and obfuscated then you need to familiarise yourself with reverse engineering toolsets.

The corollary to this is, "If there aren't tests, start by writing some unit tests on your own that exercise the library's functionality."

Maybe it's just me and my avoidance of unit testing, but often when I tried this the tests appeared overly complex in how they set things up, and the code you need is buried among dozens of tests and their setup. I think it's a good idea to look at them, but more often than not I did not want to get through this and just tried to understand the code flow by reading the source.

Seconding this! Tests are a wonderful way to quickly discover the properties of the codebase that the authors considered important.

1. learn a good editor.
2. write a script to concatenate all the code files in a folder, separated by filenames.
3. pipe that result to your editor.
4. use your editor's "find" functionality.
By reading the entire source code in a single file, you have global knowledge of the entire codebase. All the information is available to you. I suggest you try it before dismissing the idea, as I once did. https://github.com/shawwn/scrap is what I use. `codefiles | grep js$ | xargs merge | ft js` will open all javascript codefiles in vim, in JS mode. `cppfiles | xargs merge | ft cpp` will open all C++ files in vim, in C++ mode. Free yourself from the loop of asking other people for answers. Stop that. Read code.
If you limit yourself to "projects that have good documentation," you'll miss out on 90% of the interesting code in the world.

I'd recommend just learning ag (or ripgrep); they make searching across projects dead easy and support limiting to specific filetypes. Reading all the code might be a decent approach for small libraries but sometimes it's excessive.

This recommendation often leads to confusion. Every time you search for something using ag or rg, you're losing all the context around the code. And yeah, -C 10 is a thing, but it's a shadow of having the actual code in an actual editor. It's remarkable how offensive this idea is to people. It's not like I came up with the idea this morning. I've been reading hundred-thousand line codebases for many, many years this way. It's how I studied and understood the original bitcoin codebase. But, you know, if you really want to be stuck in the loop of "ok, this file calls Foo, let me switch to terminal and search for Foo... Ok, now I'll open that file and read it.. Oh it calls Bar, I'll search for bar..." then feel free. And yeah, an IDE is the antidote. If it's a JS project, use `webstorm .`. For python, use `charm .` Unfortunately `clion .` doesn't seem to work for C++ codebases -- you have to "import" the code first, which is highly annoying and generates extra CMakeLists.txt files. VSCode might be fine and automatic and have perfect go-to-definition functionality even for template metaprogramming; I don't know. But I do know that the technique I've described above will work 100% of the time.

That's why you use a plugin that integrates ag or ripgrep into your editor so that you can see the context in your editor as you search.

Sure, that's fine. Personally, I hate dealing with vim plugin nonsense, and especially with getting vim environments working on a remote server inside tmux. But whatever works.
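For anyone who wants to try the single-file technique argued over above (concatenate every source file, separated by filenames, then read the result in an editor), a minimal Python sketch follows. It is not the `scrap` tooling linked in the thread; the function name `merge_sources`, the banner format, the default suffixes and the skipped directories are all assumptions chosen for illustration.

```python
import sys
from pathlib import Path


def merge_sources(root, suffixes=(".ts", ".js"), skip_dirs=("node_modules", ".git")):
    """Return one string holding every matching file under root.

    Each file is preceded by a banner line with its relative path, so an
    editor search for "=====" jumps from file to file.
    """
    root = Path(root)
    parts = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        relative = path.relative_to(root)
        # Skip vendored or VCS directories anywhere along the path.
        if any(part in skip_dirs for part in relative.parts):
            continue
        banner = f"// ===== {relative.as_posix()} ====="
        body = path.read_text(encoding="utf-8", errors="replace")
        parts.append(banner + "\n" + body)
    return "\n\n".join(parts)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python merge_sources.py path/to/library > everything.ts
    sys.stdout.write(merge_sources(sys.argv[1]))
```

Because the banner keeps each file's relative path, the folder layout is still visible in the merged output, which partly addresses the objection that file names and locations carry meaning.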
The other reason I don't favor this approach is that it totally ignores the fact that the organization of files and filenames also gives you valuable semantic information. If I had a library that, for some insane reason, decided to scramble its source across a gigabyte of filenames generated by a hash function, then I would definitely choose your approach. But most libraries (even horrible internal libraries) aren't like that. The name and location of files allows one to better understand whether the search result is relevant or not. As a cheap example, with `find` or `ag`, I can easily exclude the `test` folder, so that I don't have to deal with usages in unit tests when I'm trying to understand a piece of library code. But, if I'm modifying that code, being able to see unit test results is valuable, because it lets me know which test files I have to update. With your approach, I get the full list of results, whether I want it or not. Plus, if you want to just jump based on keywords, ctags is more than enough and support is built into most decent editors.

This sounds like trying to create a poor man's IDE with go to definition / find usages functionality. Is there an IntelliJ product for JS yet?

I regularly read and understand 50,000+ line codebases with this technique. Again, I suggest trying it before dismissing the idea. A good IDE is nice, when they work, but this fallback has worked 100% of the time. CLion is for C++. PyCharm is for Python. Webstorm is for JS. But Vim is for everything. To put it differently: how often do you use ripgrep on a large codebase? If the answer is "often," then every time you switch to your terminal, you're losing context about the code. No wonder it's impossible to understand when you're having to read code fragments every few minutes. Read the whole code.

No argument against the one code file thing because I think it’s a great idea.
But a quick comment on the JetBrains suite: With the right plugins Intellij Ultimate is also for everything (just like Vim uses language plugins) and then you get all the benefits of the modern IDE: Intellisense, search, replace, refactor, find usage, type inference, etc. I’m not saying vim cannot do this but Intellij is now my preference for everything and I don’t feel the need for merging everything in one file for analysis because I can jump around to definitions/usages easily.

Yeah, that's valid. `idea .` seemed to work occasionally back when I tried. But what I ran into was, you often want to install specific plugins for JS, and specific plugins for Python, etc. On my laptop, the result was that IDEA started taking like ... 4 minutes to fully load a codebase. So I just gave up. But IntelliJ is wonderful in general. Maybe others will have more luck.

Why would you switch to the terminal to use grep from Vim? :grep works perfectly well, and is more convenient than search in a flat file (better context for matches, quickfix list instead of n/N nonsense).

I haven't used it, but there is WebStorm [1]. Visual Studio Code also has fairly good jump to definition/find usages, thanks to the fact that it has a typescript compiler built in, which allows it to perform analysis of JavaScript projects as well.

There is lots of good advice about how to figure out how it works in this thread. One piece of advice I have is to write formal documentation of some form as you figure it out. Share it as widely as possible. If nothing else, it will be very useful for you and your co-workers in the future. It sounds like there is some sort of community you can share it with. Ideally there would be some way to contribute it back to the source of the software for distribution with it, but that often isn't possible with proprietary software. Regardless of who you share it with, it will help establish you as an expert within that group.
In addition to helping people, it will likely be good for your career.

If you think that this knowledge will be important to your employer then try not to share it too much in an organised and documented way but rather help others on specific issues. This is more effective for establishing yourself as the expert and go-to person. If you write comprehensive documentation then others need you less.

Yikes. Would you want to work in an environment where all your coworkers acted like this? Your suggestion may be necessary in a cut-throat workplace, but I'd be more inclined to GTFO and work somewhere that my team members actually try to help each other, instead of always acting in their own self interest.

Did I suggest not to help? No, on the contrary. Of course you should be helpful, but you should also be smart. You want to be seen as valuable AND as difficult to replace. Help your employer succeed and help yourself succeed at the same time. What would you rather hear in management meetings? "We can't let Bob go, he's the expert on X, everyone goes to him for help and we need him" or "Sure Bob is an expert on X, but he wrote down all he knew so we'll manage". See, it's not about not helping, it's about helping while building and retaining leverage. That's why companies want to encourage "knowledge sharing". It's not to foster a friendly atmosphere, it's to be robust against someone leaving and to prevent someone from having too much leverage. Most experienced engineers know that and tend to be wary when asked to document what they know in detail and/or to train others. As to acting in one's own self interest, well sorry to be the one to break it to you, but that's how the world works in general and that is especially how the workplace works. And in fact that's how everyone works when there's a choice to be made. The sooner you realise that the better off you will be.

Well yeah I guess this is what some people actually do.
Personally I would never want to work this way though. And besides, there is probably no-one that would document all they know anyway. Simply because no-one has time to write everything down and no-one has the time to read all that text. But aiming for good documentation is still a good thing IMO.

One thing I haven't seen people suggesting here yet is to use a repl! Import the thing, and then look at what it provides. If something seems useful, try calling the function/instantiating the class; if it gives you an error message, try with different arguments. Hopefully, you should have some idea of what the library is trying to do, so you should be able to see some functions that look like they accomplish the kinds of things you want. Guess what kinds of arguments they take and try it. If you can't figure that out, jump into the source and figure it out. I find it's much nicer working interactively like this than just reading the source because you can immediately try things out rather than jumping back and forth all the time. Also, some languages like python have a `help()` function that you can call with any class/method/function to get to the docs on it (I can't remember anything like that for javascript, so you might be out of luck there).

If you have the source code to this library, `find` and `grep` are your best friends. If nothing else, you should be able to find other usages of the code you're looking to use (or maybe even tests), which will let you know how that code is supposed to be used. The other things to look for are classes, functions, or modules that aren't used by other code in the library. Those tend to be the "top-level" code intended to be called by application code. Seeing how those are structured and what functionality they expose can be a great way to discover functionality that documentation leaves out.

- Ping people directly in that community.
- Get yourself a notebook (or Google doc, whatever) and thoroughly write down everything you learn.
- Walk through the source code methodically, and read the jsdoc/function names wherever possible. Don't read too much into implementation.
- Use whatever tools you're comfortable with for this. Generating call graphs or reading through tests first makes a lot more sense than trying to read the entire library.
- Start by documenting Hello World and go from there.

The other comments offer some good advice. If I'm really desperate, I'll search GitHub for projects that use the library to see how they use it.

This is what my priorities would be:
1. Make sure you have a good debugger set up for any existing code. This is to answer the question of exactly how a function behaves, what the parameters mean, etc. You know you're going to be dealing with undocumented functionality, so you need a way to quickly answer your own questions.
2. Someone at your organization is paying for this, right? Ask them to pressure the supplier for one-on-one support to answer your questions regarding how to use it. You want someone on a video conference who knows what they're talking about so you can explain things quickly and not have to write up detailed emails and wait for a response.
3. You could try to document it yourself: Locate all the exports and create a list of them. Browse it for key names and concepts. Write down the purpose of each, and their relationships. There might be a lot of these items but it won't be infinite. Even if there are 500, if you do 20 a day you'll be done in 6 weeks and by that time you should have a pretty good picture of what's going on.

I use Learning Tests for such situations. https://blog.thecodewhisperer.com/permalink/when-to-write-le...
(not fully read by me but seems to explain this technique very well)

Boring answer, but your best bet will be reading the source code and documenting each module's external API.

There's a lot of good advice in here about how to work around this situation but the best thing you can do here could just be: email this company and ask them. If you are paying for this proprietary software, especially if you're still in the "evaluating whether we should spend a lot of money on this" phase, you should absolutely push back on them. Ask them all the questions you need, big or small. It's really on them to give you something well documented, and if they don't, they better be willing to answer all your questions about it. I've seen this sort of customer behavior be the catalyst to get companies to actually document their stuff because all their engineers' time was spent answering the same questions over and over.

If you can't learn by example, which is the inductive process and the one that I am most comfortable with and it sounds like you are too, you need to learn by deduction. For node or Ruby or other pure open source environments there are endless examples on the Internet and when you want to learn you can read 50 of them until they start making sense. When there isn't much documentation, you have to deduce the reasoning that went into the codebase or you may never make progress. It's a slower and more demanding process. On a side note, before the explosion of web content, this was how a lot of programming had to be learned. Maybe talk to/bring in older programmers to help you?

When I've been using open source libraries without significant docs I've mostly benefitted from actually reading the code. Assuming you have the source. This can vary enormously in complexity. I've found a large Elixir codebase, being in a functional paradigm, easier to grasp than a single library written in heavy OOP style in Python.
Python doesn't enforce much structure, and this particular library does a lot of inheritance, which complicates the state and modelling in my head a lot.
So it varies a lot per code base and experience. But if you have the code, that's what I'd use.

A good navigation tool can really help sort out a foreign code base. I suspect it could take you a few hours (with luck...) to install Sourcegraph but when you succeed it will be worth your time! (and the data should be local to the host where you install it). Found a random article online that shows it step by step with pictures [1] (I think it is a bit more visual than the canonical docs). Good luck!
1: https://www.techrepublic.com/article/how-to-install-sourcegr...

One of the tricks I use is to write a minimal harness, then inject stimulus, and observe the response. I do this for Bluetooth devices, and I have also used utilities like REST explorer apps, Bluetooth Explorer, PacketLogger, USB Explorer, Charles Proxy and Wireshark. The drawback is that I could accidentally codify features subject to change. All that said, I tend to be veeery leery of any dependency. Adding dependencies is a serious issue. If the dependency is badly documented, then that’s a “red flag” that it may not have much of a future.

Not directly helpful for your case, but useful in general: In strongly statically typed languages like Haskell, the types can often give you an adequate introduction to a new library. There are quite a few open source Haskell libraries that basically only have type annotations, but no proper documentation. The latter would be better, but the former is already surprisingly useful on its own.

Usually I just look into source code. Use the 'tree' command to see the folder structure and then pick a file that seems relevant. Then I go to the bottom of the file and work my way up (usually the main entry points are at the bottom, depending on the language of course).

I don't. If it doesn't have good docs, it's usually not worth using. I know for your situation you need to use that particular library, but if given the choice, the better documented one is usually better to use.

1. Read the source code to understand which methods exist in each module, then document that somewhere myself if necessary.
2. Ensure your IDE has autocomplete so you can step through the suggestions when importing a module or calling a method.

Just read the library's source code, examine the methods and see what they do.

More often than not, look at the code directly if there's no other way. Sometimes there are comments in the code that point to better understanding, if you are lucky.

The source is the documentation. So read the source code. If the source is not available or obfuscated make sure you charge per hour.

Tests, Comments and Source are the other forms of documentation apart from the traditional docs.

Just out of curiosity, what is the library you’re using?

A combination of intellisense, a repl, and reading the source.
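Since several replies suggest a repl and listing what a module exposes, here is a minimal Python sketch of that workflow. The helper name `public_api` is made up for this example, and the standard-library module `textwrap` is only a stand-in for the library being explored.

```python
import importlib
import inspect


def public_api(module_name):
    """Map each public callable in a module to its signature string.

    The value is None when no signature can be introspected, which
    happens for some builtins implemented in C.
    """
    module = importlib.import_module(module_name)
    # Prefer the module's declared exports; otherwise fall back to every
    # name that does not start with an underscore.
    names = getattr(module, "__all__", None) or [
        name for name in dir(module) if not name.startswith("_")
    ]
    api = {}
    for name in names:
        obj = getattr(module, name, None)
        if not callable(obj):
            continue
        try:
            api[name] = str(inspect.signature(obj))
        except (TypeError, ValueError):
            api[name] = None
    return api


if __name__ == "__main__":
    # `textwrap` stands in for the poorly documented library.
    for name, signature in sorted(public_api("textwrap").items()):
        print(f"{name}{signature or '(...)'}")
```

From there, calling `help()` on anything that looks promising shows its docstring, as mentioned earlier in the thread. In a Node repl, `Object.keys(require('some-package'))` (with a real package name substituted) gives a comparable list of exported names.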