Show HN: Codemap – Codebase Visualizer for JavaScript, TypeScript, and Python

84 points by ru6xul6 6 years ago · 40 comments


ru6xul6OP 6 years ago

Hey HN, I'm the creator of Codemap.

Codemap is a codebase visualizer that displays the structure of function calls in any JavaScript, TypeScript, or Python code. Given a local repo, Codemap statically parses the code and renders a directed graph of all function calls in a layout designed to minimize clutter. It helps programmers familiarize themselves with a new codebase, trace possible scenarios that lead to a bug in a function, or understand the scope of impact when making a code change. Codemap runs offline on your local machine and never sends any sensitive code data to remote servers.

I built Codemap because I noticed that good engineers always spend quality time understanding the code architecture before making changes. This process is crucial, and it usually requires drawing a function call diagram on a whiteboard or in their heads. Some people use the IDE's "Find usage of function..." search box a million times, some ask senior engineers for tribal knowledge, and some just get lazy and start writing code without understanding it.

As a software engineer, I think there's a better solution than this tedious and error-prone practice. I looked at existing code visualization tools and felt that their visualizations tend to be overly cluttered and not that user-friendly. I believed that, with some design effort, I could build a tool that is pleasant to use, easy to navigate, and versatile enough for any codebase in the supported languages.

During the development of Codemap, I would sometimes grab a popular GitHub repo to test the app, and I often found myself exploring its call graph for hours to satisfy my genuine curiosity about "How does <popular project> work?" I'd encourage curious HN readers to visit the website and see screenshots of call graphs from React, Keras, Django, and TypeScript (yep, the one from Microsoft).

I would love to hear what you love and hate, as well as any questions you may have. I'll be here to respond as best I can (I'm on PST). Thank you :)

divan 6 years ago

Hey, great stuff, but my question is whether the produced visualization/graph is similar to how you would "see" the code in your head yourself?
The problem of familiarizing yourself with and navigating a textual codebase is embarrassingly unsolved, and I explored this in the post "Rethinking visual programming with Go" [1]. My main issue with visualization tools is that they offer an intermediate representation of the code, which still has to be "decoded" into a proper mental map.
I see the future of these visualization approaches in AR/VR. Code visualization should match the mental map as closely as possible, and the better we can use spatial tools for representation, the better the result.
[1] https://divan.dev/posts/visual_programming_go/
- amw-zero 6 years ago
  
  I think a graph is how we perceive code, I just think the human brain has a very unique way of comprehending graphs. It seems to me like the graph is “indexed” in multiple different ways, so that multiple paths exist between nodes. The brain is also better at categorizing nodes - e.g. “fetch everything that I know about colors in general.”
  I definitely think of code in graphs, but it’s definitely beyond just a static graph diagram.
  - noir_lord 6 years ago
    
    Graphs and tags for me. It's hard to describe, but I can picture the parts of the codebase I've worked with only roughly until I need to interact with them; then I can hold those details in my head, but I lose that clarity somewhere else. I think it's a cognitive limit I have. I don't know where that lands against other developers, and to an extent I think it can be trained.
    As a lead/senior, the thing I've observed as the crucial difference between, say, someone with a year or two's experience and someone at my point is how fast I can get up to speed with a codebase I've never seen or touched.
    Partly I think it's knowing how to use the tools better to get the answers I need, and partly pure cynicism: I've learnt to turn off the "if this was done sensibly" bit in my head so I can see what the code is actually doing rather than what it probably should have been doing.
    One of the things I really focus on with my juniors is how to use the tools available (and an A4 pad and pen) to figure out what the system is actually doing.
    
    ru6xul6OP 6 years ago
    
    Can't speak for you, but I think a lot of that comes from experience, e.g. putting out fires caused by whatever stupid reason and eventually building up an awareness of what could go wrong. After that, even without reading too much into the code, simply being aware that such pitfalls exist is a big help.
  - ru6xul6OP 6 years ago
    
    Agreed that our brain's comprehension goes far beyond a static graph. Ultimately I want to see all functions as fluid, so the tool can answer dynamic questions like "show me the journey from this input data to that database update query". I think namespaces, modules, classes, and methods are all valid ways to organize this mental understanding, but they are rigid and fail to capture the dynamic nature of code. I think a 2D graph like Codemap loosens that rigidity a bit, but not entirely yet.
    
    amw-zero 6 years ago
    
    The path you describe ("input data to database query") is most certainly representable by a graph. I'm not sure what you mean by "static graph." I do believe our memory is persistent, "static," and finite. We hold discrete pieces of information and connections between them. Aka, a graph.
- ru6xul6OP 6 years ago
  
  Hey Ivan, thanks for sharing the great article! I actually read it last year and left a comment (name is Chentai). Back in 2019 I built an IntelliJ plugin [1] as a prototype for Java, and now I'm building Codemap as a natural follow-up. Glad to reconnect :) I definitely share your sentiment towards visual programming. AR/VR may be a better vehicle for conveying code structure, but I think a 2D call graph, if done well, could serve as a valuable stepping stone for now. Hopefully Codemap can be a tool that inspires future designs and ideas.
  [1] https://plugins.jetbrains.com/plugin/12304-call-graph
  - divan 6 years ago
    
    That's awesome! I remember that project, yes :)
    BTW, there is a community for people exploring new ways of coding - https://futureofcoding.org. You might want to join it, if you haven't yet.
    
    ru6xul6OP 6 years ago
    
    Nice, another rabbit hole for the weekend. Thanks for sharing!
- nefitty 6 years ago
  
  Your caveman-with-a-torch analogy blew my mind.
  - ru6xul6OP 6 years ago
    
    Totally agreed! I read his article last year and this analogy is still vivid in my head.
dnt404-1 6 years ago

I was thinking about a tool like this and wanted to build one after I finished my Master's, when I would have time (as I work full-time). The experience you described is exactly why I wanted to build such a tool in the first place. And this is going to be really useful.
However, is there a Windows version? That is, unfortunately, what I have to use at work.
- ru6xul6OP 6 years ago
  
  Sorry for not supporting Windows yet. As I'm less familiar with the Windows toolchain, it will take me a little longer to port. If you don't mind, please feel free to shoot me an email (in my profile), so I can notify you once it's released.
- ru6xul6OP 6 years ago
  
  Hey, the Windows version is ready. Please visit the download page [1] to try it out :)
  [1] https://codemap.app/download
techsin101 6 years ago

Can you black-box framework code? How does it work with React context or Redux?
- ru6xul6OP 6 years ago
  
  Only functions that are defined in your codebase will be rendered in the graph. Thus framework functions are excluded.
lgregg 6 years ago

Can this be used inversely to find dead code?
- ru6xul6OP 6 years ago
  
  Absolutely. All those individual nodes linked to nothing else are good suspects.
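
As a rough illustration of that answer, here is a minimal Python sketch. It assumes the call graph is available as a mapping from each function name to the set of functions it calls; that representation, the function names, and the sample data are all hypothetical and not Codemap's actual format. Static analysis can miss callbacks and dynamic dispatch, so the results are only suspects, as noted above.

```python
def isolated_functions(call_graph: dict) -> set:
    """Return functions that neither call nor are called by anything in the graph."""
    # Every function that appears as a callee somewhere in the graph.
    called = set().union(*call_graph.values())
    return {
        name
        for name, callees in call_graph.items()
        if not callees and name not in called
    }


def uncalled_functions(call_graph: dict) -> set:
    """Return functions with no incoming edges (includes real entry points)."""
    called = set().union(*call_graph.values())
    return set(call_graph) - called


# Hypothetical call graph: caller -> set of callees.
sample_graph = {
    "main": {"load"},
    "load": {"parse"},
    "parse": set(),
    "legacy_export": set(),
}

dead_suspects = isolated_functions(sample_graph)
no_callers = uncalled_functions(sample_graph)
```

Here `dead_suspects` contains only `legacy_export`, while `no_callers` also includes `main`, which shows why entry points need to be filtered out by hand.
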
indentit 6 years ago

Looks great - how does it work? Or, what is needed to add new language support like C#? Pricing seems very reasonable.
- ru6xul6OP 6 years ago
  
  Glad you like it! It uses a language-specific parser to generate function dependency data, e.g. a list of function calls, which is then consumed and visualized by shared UI components that render the graph. To build the language parsers, I started with the Language Server Protocol (LSP) [1], but soon realized that some LSP libraries are outdated and that I should go deeper, to the core language parser, e.g. using TypeScript's tsserver for JS/TS code. To support a new language, I'll need to integrate a good parser for that language and generate the function dependencies; the UI and graph rendering will take care of the rest.
  [1] https://microsoft.github.io/language-server-protocol/
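
To make the "parser generates function dependency data" step concrete, here is a minimal Python sketch of the general idea. It is not Codemap's implementation (which the author describes as built on language-specific parsers such as tsserver); it simply uses the standard-library `ast` module. The name `build_call_graph` and the `SAMPLE` source are hypothetical, and the sketch only resolves bare-name calls such as `parse(x)`, so method calls and cross-file imports are ignored.

```python
import ast


def build_call_graph(source: str) -> dict:
    """Map each function defined in `source` to the locally defined functions it calls."""
    tree = ast.parse(source)
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)

    # Only functions defined in this source become nodes, so builtins and
    # framework functions never show up in the graph.
    defined = {node.name for node in ast.walk(tree) if isinstance(node, func_types)}
    graph = {name: set() for name in defined}

    for node in ast.walk(tree):
        if not isinstance(node, func_types):
            continue
        # Walk the function body and record calls of the form `name(...)`.
        for inner in ast.walk(node):
            if (
                isinstance(inner, ast.Call)
                and isinstance(inner.func, ast.Name)
                and inner.func.id in defined
            ):
                graph[node.name].add(inner.func.id)
    return graph


SAMPLE = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    return text.split()

def unused():
    return len("x")
'''

graph = build_call_graph(SAMPLE)
```

For the sample above, `load` gets edges to `read` and `parse`, while `open` and `len` are dropped because they are not defined in the parsed source. A real tool would also need scope and import resolution, which is where a language server or compiler API helps.
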

avery42 6 years ago

This looks cool! Pricing looks reasonable.

My main piece of feedback is that it's still a bit difficult to tell what this does, and more specifically how it solves the problem of understanding a codebase. For example, how well does it understand my code (does it use language servers?), are there any views other than the tree, can I jump into a file and choose a function, etc.?

Even if I download it to try it out, the 100-function limit on the free plan seems like it would be quite limiting for trying the tool as part of my workflow (I think a codebase that I have difficulty navigating tends to be larger than that).

The tool is cheap enough that I wouldn't mind skipping the free plan entirely, but it doesn't seem like there's a way to purchase just one month.

One question on the performance side: is the app native, and how large a codebase can it handle while staying snappy?

ru6xul6OP 6 years ago

Thanks for the feedback and great questions! Let me address your questions below.
- You nailed it, Codemap uses language servers (TypeScript for TS/JS, Jedi for Python) to understand the code.
- Tree view is currently the only layout that's available. If you have any thoughts on this please let me know!
- It doesn't support jumping into a file and choosing a function, but it supports a few interactions that may be just as useful. A few examples: (1) zooming in/out, either horizontally or vertically; (2) limiting the render scope by folders and files (useful to keep performance up and avoid clutter); (3) showing only nodes that are upstream or downstream of a particular function (useful for tracing the scope of impact); (4) a quick code preview for a node (function definition) or an edge (function call).
- You're right, the 100-function limit is a little harsh. The goal is for people to try out the tool with a few files and get a taste of it. The price is pretty affordable though, as I want Codemap to serve programmers around the world.
- There's currently no option to purchase just one month, but I'd definitely consider it!
- This is an Electron app (please don't scream :p), built with the goal of market validation and gathering early feedback. If this app attracts enough interest, I'll seriously consider rebuilding it as a native app. That said, in my experience Codemap remains snappy up to about 2000 moderately interconnected functions, at which point clutter becomes a bigger problem than slowness, and limiting the rendering scope (folders/files) becomes necessary.
Again, thanks for the thorough feedback! Let me know if you have more questions.
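
The upstream/downstream filter mentioned in the list above can be sketched as a graph traversal. This is a minimal Python illustration, assuming a call graph stored as a mapping from caller to the set of its callees; the representation, the helper names, and the sample data are hypothetical and not taken from Codemap.

```python
from collections import deque


def reachable(graph: dict, start: str) -> set:
    """Breadth-first search: every node reachable from `start` by following edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph: dict) -> dict:
    """Flip every edge so that callees point back to their callers."""
    rev = {name: set() for name in graph}
    for caller, callees in graph.items():
        for callee in callees:
            rev.setdefault(callee, set()).add(caller)
    return rev


def downstream(graph: dict, fn: str) -> set:
    """Functions that `fn` calls, directly or indirectly."""
    return reachable(graph, fn)


def upstream(graph: dict, fn: str) -> set:
    """Functions that can end up calling `fn`, directly or indirectly."""
    return reachable(reverse(graph), fn)


# Hypothetical call graph: caller -> set of callees.
calls = {
    "handler": {"validate", "save"},
    "validate": set(),
    "save": {"db_write"},
    "db_write": set(),
    "cron_job": {"save"},
}

affected_by_save_change = upstream(calls, "save")
used_by_save = downstream(calls, "save")
```

In this sample, changing `save` could affect `handler` and `cron_job` (its upstream), while `save` itself depends only on `db_write` (its downstream).
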

duutfhhh 6 years ago

Some time ago I tried building something like this too, but lost interest at the end. Feel free to pick ideas if you find them interesting:

http://iswaac.dev/blog/2019-12-14/

The basic idea is to use nested boxes to represent ownership and lines to represent dependencies. I only did the first part to some extent: parsed JS and rendered the AST into a treemap using WebGL.

What I actually envisioned is a set of 3D nested boxes rendered in a "galaxy style", where you can fly from one box to another, click on boxes to show their sub-boxes, and see the lines that represent connections. The 3D space would help unclutter the image.

Edit1. While I think that nested boxes is the natural way to represent ownership, it's often difficult to figure this ownership from the code. For example, class A is really the owner of B and C, but those are given to A as args in the constructor, so to an unsophisticated analyzer tool, B and C would be just deps. Perhaps the tool should allow the user to adjust the relationship status between objects based on some a priori knowledge. The changes would need to be saved, so next time the diagram is rebuilt from code, the visualizer would remember that A owns B and C. I believe this would be really useful in big companies that struggle to keep such diagrams up to date.

Edit2. I'm reading my own note about WebGL and matrix projections and laughing: I wouldn't write that nonsense with my current understanding of webgl.

ru6xul6OP 6 years ago

Awesome project! I particularly love the AssemblyScript visualization resembling a city view. Thanks for sharing it! I'm not sure I follow the definition of ownership, and maybe you can help me out here. Does ownership mean that, if class A owns properties B and C, iswaac renders box A enclosing boxes B and C? How does it apply to function relationships when they don't form a strictly nested tree? This reminds me of the Firefox 3D view [1] of HTML DOM elements, which is another handy visualization that's unfortunately discontinued. I think it helps frontend engineers fight spaghetti HTML/CSS code.
[1] https://developer.mozilla.org/en-US/docs/Tools/3D_View
- duutfhhh 6 years ago
  
  It's the usual "A contains B" relationship. At the top level, the entire project contains everything else, so there is one big box with many boxes inside. Usually, the next level is folders, then files, then classes and methods. But at the class level, it gets tricky. There's the syntactic hierarchy, where class A contains method B, and there's the logical hierarchy, where we know that in a working app class A is the owner of class B, meaning that an instance of A creates an instance of B, manages its state, and destroys it. The visualizer needs to somehow account for this or let the user specify this relationship.
  I think that a 3D WebGL visualizer would be cool, but a Graphviz-style diagram generator would make money.

rswerve 6 years ago

This looks cool. Indexing node_modules feels like the wrong default. I waited several minutes with the CPU at 100% after startup for a fairly small React project.

And maybe the free tree could start at a more useful place? index.js or App.jsx or App.vue? In my case, the 100 functions are all instrumentation and polyfills, which doesn't give me much sense of the potential utility.

In a Django project, I see...nothing.

Good luck! This seems like it could be useful.

ru6xul6OP 6 years ago

Thanks, points well taken! I think parsing .gitignore plus some basic exclusions like node_modules would be a good start. For the free version, you're totally right that it should pick 100 functions that are more interesting to play with. To avoid assuming a project structure (like the presence of App.vue), I'll probably prioritize showing the connected components with the most nodes. Sorry for the issue with Django projects, I'll test it out and see what's going on there.
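
One way the "connected components with the most nodes" idea could look, as a minimal Python sketch. It treats call edges as undirected for grouping, sorts components by size, and fills a fixed budget from the largest one down. The function names, the `limit` parameter, and the sample graph are hypothetical; this is not Codemap's selection logic.

```python
def connected_components(call_graph: dict) -> list:
    """Group functions into components, ignoring edge direction, largest first."""
    # Build an undirected adjacency map from the directed call edges.
    neighbors = {name: set() for name in call_graph}
    for caller, callees in call_graph.items():
        for callee in callees:
            neighbors[caller].add(callee)
            neighbors.setdefault(callee, set()).add(caller)

    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first search collects everything connected to `start`.
        component = set()
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def pick_functions(call_graph: dict, limit: int = 100) -> list:
    """Pick up to `limit` functions, taking the largest components first."""
    picked = []
    for component in connected_components(call_graph):
        for name in sorted(component):
            if len(picked) == limit:
                return picked
            picked.append(name)
    return picked


# Hypothetical graph: one app-level cluster and one small polyfill cluster.
project = {
    "app": {"render", "fetch"},
    "render": set(),
    "fetch": set(),
    "polyfill_a": set(),
    "polyfill_b": {"polyfill_a"},
}

preview = pick_functions(project, limit=3)
```

With a budget of three, the preview is filled entirely from the larger `app` cluster and the polyfill pair is left out.
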

heliodor 6 years ago

How do you determine whether python is installed?

I'm getting this error on OSX:
> Please make sure Python3 is installed on your computer.

My python3 installation works fine and is located at: /usr/local/bin/python3

ru6xul6OP 6 years ago

It checks whether `python3 -V` exits successfully in a Node.js child process. Is it possible that `python3` is not available in your environment? Or perhaps `python3` isn't linked properly? If you use brew, `brew link python3` could be helpful.
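
The same kind of check can be sketched in Python for anyone who wants to reproduce it outside the app. This is only an approximation of the idea described above (the app itself reportedly runs the command from a Node.js child process), and the helper name `command_succeeds` is hypothetical. Note that the lookup depends on the `PATH` of the process running the check, which for a desktop app may differ from the `PATH` in your terminal, so that may be worth comparing.

```python
import subprocess
import sys


def command_succeeds(argv: list) -> bool:
    """Return True if the command can be started and exits with status 0."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # OSError covers "executable not found" and permission problems.
        return False
    return result.returncode == 0


# Check the interpreter running this script, then whatever `python3`
# resolves to on the current PATH (the result depends on the environment).
print(command_succeeds([sys.executable, "-V"]))
print(command_succeeds(["python3", "-V"]))
```
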

craz 6 years ago

I’ve been using SourceTrail[0] since they open sourced it to get my head around a large Java project. It doesn’t yet support TypeScript projects so I’m really interested to see how Codemap goes on our React app.

[0] https://www.sourcetrail.com/

ru6xul6OP 6 years ago

Yup, SourceTrail was probably my favorite before I decided to build Codemap. Since my day job mostly uses TypeScript, I had to build this to serve my own needs.

helb 6 years ago

I'm unable to get anything other than a message saying "Empty graph" for Python. Does it need to know about venv somehow?

JS works fine.

ru6xul6OP 6 years ago

It should work regardless of venv. Could you shoot me an email (in my profile) so I can follow up with you?

asdfasdfafaf 6 years ago

```
Uncaught (in promise) abort("Cannot enlarge memory arrays. Either (1) compile with -s TOTAL_MEMORY=X with X higher than the current value 16777216, (2) compile with -s ALLOW_MEMORY_GROWTH=1 which allows increasing the size at runtime but prevents some optimizations, (3) set Module.TOTAL_MEMORY to a higher value before the program runs, or (4) if you want malloc to return NULL (0) instead of this abort, compile with -s ABORTING_MALLOC=0 "). Build with -s ASSERTIONS=1 for more info.
```

ru6xul6OP 6 years ago

Sorry about the issue. It seems to be struggling with memory while rendering the graph. Are you rendering a large graph? Could you try a smaller scope by restricting the number of files and folders?

tcrow 6 years ago

I have done this kinda thing to great effect using Doxygen, although this solution adds a bit of dynamism that is really appreciated. Having the call graphs, especially for a new or unfamiliar codebase, is a great way to gain understanding. Kudos on the launch, I think a lot of engineers will find this pretty useful once you expand on it a bit.

ru6xul6OP 6 years ago

I really appreciate your kind words! From your experience using static graphs from Doxygen, is there any feature in particular you'd love to have? I'm actively improving this tool based on the awesome feedback from HN, and what you wish for has a good chance of landing in the next upgrade :)

kissgyorgy 6 years ago

OMG this would have been so handy. I used to track the call stack by following functions and writing it out to a text file.

ru6xul6OP 6 years ago

Thanks, I had the exact same habit until this :)
