TextAnalysisTool.NET

textanalysistool.github.io

104 points by gadiyar 2 years ago · 44 comments

totetsu 2 years ago

https://lnav.org/

Is a good Linux command line tool in the same genre

  • AdieuToLogic 2 years ago

    > Is a good Linux command line tool in the same genre

    It is a good macOS/FreeBSD command line tool as well.

    • totetsu 2 years ago

      (Not .NET I should say ;))

      • AdieuToLogic 2 years ago

        > (Not .NET I should say ;))

        Perhaps it might be[0]?

        I'm not quite sure how I feel about that however... :-D

        0 - https://learn.microsoft.com/en-us/dotnet/core/install/

        • totetsu 2 years ago

          Everyone has autonomy over their own system. Who are we to judge if they want to do something like that.

          • delta_p_delta_x 2 years ago

            > do something like that

            What's wrong with this? Genuine question.

            Modern .NET is fully open-source with a permissive MIT licence. This includes the compiler and analyser infrastructure (Roslyn), the package manager (NuGet), and even the shell language (PowerShell).

            It is a superb alternative to Java, Go, and similar languages. Why is using .NET on Linux or macOS such a weird thing?

            • luismedel 2 years ago

              Agree. While I completely understand and respect anyone's reasons to not use any piece of tech, whenever we talk about .NET I find it funny that the main reason to not use it is that "it comes from M$$$$44". That's all the technical analysis.

              In my experience, it is as great as any other backend stack for UNIX. But, hey! If anyone wants to ban a piece of software on their systems for whatever random reason, they're free to do that. Luckily they have tons of alternatives from companies with a great sense of ethics (Go, Swift, Java, ...)

            • HeckFeck 2 years ago

              Developing a .NET CRUD webapp on Mac using Rider, ASP.NET + EF Core and PgSQL.

              Deploying on Debian Linux behind a reverse proxy.

              It's a comfy life. Everything (more so than the Java ecosystem) just works.

              That said, I still have that nagging fear that Microsoft will do a Microsoft in some way and I'll be forced back onto Windows with all its attendant horrors.

              • leosanchez 2 years ago

                This[1] comes under "doing a Microsoft", right? It is my belief that M$ is intentionally not improving dotnet watch.

                The current implementation is not as good as what you get from javascript world.

                1. https://www.theverge.com/2021/10/22/22740701/microsoft-dotne...

                • HeckFeck 2 years ago

                  https://isdotnetopen.com/

                  Some other concerns are raised here. E.g. I wasn't aware that the debugger was licenced restrictively and not under the same permissive licence as the rest of .NET Core.

                  Well I am quite far into this project using .NET. Hopefully it doesn't get worse.

              • donny2018 2 years ago

                I have a feeling that Microsoft has “let go” of Windows. It has expanded far beyond it and no longer even needs to depend on it as a separate revenue stream.

            • hulke 2 years ago

              As far as I know the .net debugger infrastructure is not open source, so unless you are happy to stay within the confines of VSCode, I think that your options are pretty limited for stepping through your code.

              Running .net code on Linux is fine, though.

              • delta_p_delta_x 2 years ago

                > the .net debugger infrastructure is not open source

                This is vsdbg (which comes with Visual Studio 2022 and the Microsoft-provided binary of VS Code).

                There are alternatives like OmniSharp[1], the debugger shipped with JetBrains Rider, and Samsung's netcoredbg[2].

                [1]: https://github.com/OmniSharp

                [2]: https://github.com/Samsung/netcoredbg

                • hulke 2 years ago

                  Good to know!

                  I've been using Omnisharp for ages, but I could never get netcoredbg working.

                  Last time I tried the integration with emacs-dap it kept segfaulting for no obvious reason.

                  But I may give it another try now!

              • otaconjh 2 years ago

                I'm currently developing .NET apps in neovim with full LSP support, it's lovely. There are definitely alternative debuggers, such as netcoredbg from Samsung.

            • nazgulsenpai 2 years ago

              I tried debugging my .net project in VS Codium and can't because the debugger will only run in Microsoft's Visual Studio Code. That was enough to make me second guess putting my eggs in that basket.

  • akoboldfrying 2 years ago

    Looks nice! Especially the SQL query feature.

dash2 2 years ago

This is grep++ right?

My guess is that it's aimed more at the humanities. Hence the GUI. My experience: in the world of humanities text analysis, there are just a ton of Java programs which were funded by some academic grant. Mostly they are closed source, not updated, might have a horrible GUI, and the website is always written in 8 point font.... Don't hate them for what they are....

keithnz 2 years ago

reminds me a bit of klogg https://github.com/variar/klogg which is more for log files and based off glogg which went dead. it has nice filtering and highlighting type stuff. It's great for live views of log files.

internetter 2 years ago

Marginally related, but this is one of the things I'm bullish on ChatGPT for. Too frequently, I've gotten hundreds of lines of malformed textual data that I need to standardize. This is like impossible with REGEX but I can drop it into GPT and it does this wonderfully.

  • BurnerBotje 2 years ago

    There is however no indication if it failed on a line when using ChatGPT, it could provide you with a slightly incorrect result.
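One partial mitigation for the missing failure signal is to check the output mechanically after the fact. The sketch below is only an illustration: the target format (ISO dates) and the function name `find_suspect_lines` are assumptions, not something from the thread. It flags lines that do not match the expected shape and a changed line count. It cannot tell whether a well-formed value is actually the right one.

```python
import re
from typing import List, Pattern, Tuple

# Hypothetical target format for this example: one ISO date per line.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def find_suspect_lines(
    inputs: List[str], outputs: List[str], pattern: Pattern[str] = ISO_DATE
) -> List[Tuple[int, str]]:
    """Return (line_number, reason) pairs for output that looks wrong.

    Line number 0 is used for problems that concern the whole output.
    """
    problems: List[Tuple[int, str]] = []
    if len(inputs) != len(outputs):
        # Dropped or invented lines are the easiest failure to detect.
        problems.append((0, f"line count changed: {len(inputs)} -> {len(outputs)}"))
    for number, line in enumerate(outputs, start=1):
        if not pattern.fullmatch(line.strip()):
            problems.append((number, f"does not match target format: {line!r}"))
    return problems


if __name__ == "__main__":
    messy = ["3 Jan 2021", "2021/02/05"]
    cleaned = ["2021-01-03", "2021-2-5"]  # second line was not normalised
    for number, reason in find_suspect_lines(messy, cleaned):
        print(number, reason)
```

A check like this catches structural slips only; spot-checking values against the input is still needed.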

    • internetter 2 years ago

      Yeah that's always been a fear but I always dog food and I've had no issues yet

      • RugnirViking 2 years ago

        I've done it with transforming data (for example, pasting a table in and asking it to turn it into LaTeX) and had the occasional issue with it misordering or forgetting things. It didn't take long for me to spot the error though.

    • ukuina 2 years ago

      You could run it through thrice with a different prompt/temperature/model and pick the majority result (or exit with success on the first two passing runs).
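A minimal sketch of that majority vote, assuming each "run" is just a callable that takes the input text and returns the transformed text. How a run talks to a model (prompt, temperature, which model) is deliberately left out, and `majority_result` is a hypothetical name.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def majority_result(text: str, runs: Sequence[Callable[[str], str]]) -> Optional[str]:
    """Return the output a strict majority of runs agree on, or None.

    Runs are executed in order and the function returns as soon as one
    output has enough votes, so with three runs the third is skipped
    when the first two already agree.
    """
    needed = len(runs) // 2 + 1
    counts: Counter = Counter()
    for run in runs:
        output = run(text).strip()
        counts[output] += 1
        if counts[output] >= needed:
            return output
    return None  # no majority: treat the whole batch as suspect


if __name__ == "__main__":
    # Stand-ins for three differently configured model calls.
    runs = [str.upper, str.upper, str.lower]
    print(majority_result("some messy input", runs))
```

Exact string comparison is strict; in practice the outputs may need normalising (whitespace, key order) before voting.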

      • akoboldfrying 2 years ago

        Good idea. If the data is a list of records where the order isn't important, randomly permuting them (ETA: then sorting the final outputs) would be another option.

        ETA2: Would the downvoter care to explain why? Genuinely puzzled.
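The permutation idea can be sketched like this, assuming a `transform` callable that maps a list of input records to a list of output records (for example, a wrapper around a model call). The names `permutation_check` and `transform` are hypothetical. The records are shuffled, transformed, and the sorted outputs compared against a baseline run.

```python
import random
from typing import Callable, List, Optional, Sequence


def permutation_check(
    records: Sequence[str],
    transform: Callable[[List[str]], List[str]],
    trials: int = 3,
    seed: int = 0,
) -> Optional[List[str]]:
    """Return the sorted output if it is stable under input shuffling, else None."""
    rng = random.Random(seed)
    baseline = sorted(transform(list(records)))
    for _ in range(trials):
        shuffled = list(records)
        rng.shuffle(shuffled)
        # Sorting the outputs removes the effect of the shuffle itself,
        # so any remaining difference is a dropped or altered record.
        if sorted(transform(shuffled)) != baseline:
            return None
    return baseline


if __name__ == "__main__":
    records = [" Alice ", "BOB", "carol"]
    print(permutation_check(records, lambda rs: [r.strip().lower() for r in rs]))
```

This only works when records are independent of each other; if a record's correct output depends on its neighbours, shuffling changes the task.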

  • osigurdson 2 years ago

    I have no idea how Regex became the standard. The syntax is impossible to remember unless you write regex expressions daily. Most people only rarely need regex so it needs to be relearned every time. It is also incredibly unsatisfying to write (and read).

    • sgc 2 years ago

      I used it a lot for a few years decades ago, and only use it rarely now. I remember the syntax well and I am not known for having a great memory. I think its terseness suits the extreme focus of its use perfectly.

    • pipeline_peak 2 years ago

      As much as I hate to hop on the AI bandwagon, this is definitely where tools like Chat GPT shine.

      Not to mention, non tech people will now be able to use what once could’ve only been done with cryptic regex.

    • broodbucket 2 years ago

      The only good thing to come out of regular expressions is https://regexcrossword.com/

    • ukuina 2 years ago

      What would you replace regex with?

  • dextro42 2 years ago

    I tried using ChatGPT (4) for format conversion. I had a draft yaml file and needed some differently structured json. Mainly with the same content.

    If you just want to change the format it works. If you need more than programming skills it seems to fail due to the amount of text.

    E.g. if you have a list of items and want ChatGPT to generate a meta field which it cannot generate using simple python code it stops after 10 to 20 elements.

    Thus at least the cloud version doesn't work so well here.

    I also wanted it to help me fill out my i18n file with translations and plural forms. Even though it got every word correct, I needed to split it into multiple requests. Not sure if the API would have worked better (I used the web frontend).

    For the plural forms I finally added them myself as it was way faster for my natural language than copy pasting all the small chunks. Really hoped for more help there.

    • fhaeberle 2 years ago

      hey, if you are searching for really seamless i18n with nice DX, check out https://inlang.com – js library, web editor, automation cli & vs code extension are just some of the completely free and open source offerings

  • Liftyee 2 years ago

    Agreed. It works especially well for formatting where semantics matter, such as separating the term and definitions of flashcards. Hard to do with code, but easy with GPT.

atesti 2 years ago

Where is the source code? It looks like they only host releases on github, but the license is MIT

  • bramblerose 2 years ago

    The MIT license just gives you permission to use the work as published. Normally that work would be in source form, but there is nothing in the MIT license requiring that. In this case, it seems that the authors chose to release the binaries under the MIT license.

brchn 2 years ago

This tool is pretty good. Used it to find the meaningful errors from giant MSBuild logs
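For comparison, the same kind of filtering can be roughed out in a few lines of Python. This is not how TextAnalysisTool.NET works internally; it is a sketch that assumes the usual MSBuild diagnostic shape of `file(line,col): error CODE: message`, and `msbuild_errors` is a hypothetical helper name.

```python
import re
from typing import Iterable, List, Tuple

# Assumed diagnostic shape: "...: error CS1002: message". Warnings and the
# summary lines ("0 Error(s)") are intentionally not matched.
ERROR_LINE = re.compile(r"\berror\s+([A-Z]+\d+)\s*:")


def msbuild_errors(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (error_code, full_line) for every line that reports an error."""
    found: List[Tuple[str, str]] = []
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            found.append((match.group(1), line.rstrip("\n")))
    return found


if __name__ == "__main__":
    log = [
        "Build started.",
        "Program.cs(12,5): error CS1002: ; expected [/src/App.csproj]",
        "Program.cs(3,1): warning CS0168: The variable 'x' is declared but never used [/src/App.csproj]",
        "    0 Error(s)",
    ]
    for code, line in msbuild_errors(log):
        print(code, line)
```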

  • sebazzz 2 years ago

    Whenever convenient, the MSBuild binary log file in combination with MSBuild Structured Log Viewer is a better fit.

andix 2 years ago

If I had to choose a name for it, it probably would be "Regex 401".
