weekly

GitHub All Languages Trending

The latest build: 2024-06-14
Source of data: GitHubTrendingRSS

The most customisable and low-latency cross platform/shell prompt renderer


Oh My Posh: a prompt theme engine for any shell

License: MIT

Documentation: ohmyposh.dev

This repo was made with love using GitKraken.

Join the community

Mastodon

Discord

What started as the offspring of oh-my-posh2 for PowerShell became a cross-platform, highly customizable and extensible prompt theme engine. After 4 years of working on oh-my-posh, I needed a more modern and efficient tool to suit my personal needs.

Support

Swag - Show your love with a t-shirt!

GitHub - One time support, or a recurring donation?

Ko-Fi - No coffee, no code.

Features

  • Shell and platform agnostic
  • Easily configurable
  • The most configurable prompt utility
  • Fast
  • Secondary prompt
  • Right prompt
  • Transient prompt

Documentation

Reviews

Thanks

aider is AI pair programming in your terminal


Aider is AI pair programming in your terminal

Aider lets you pair program with LLMs, to edit code in your local git repository. Start a new project or work with an existing git repo. Aider works best with GPT-4o and Claude 3 Opus and can connect to almost any LLM.

Getting started

You can get started quickly like this:

$ pip install aider-chat

# Change directory into a git repo
$ cd /to/your/git/repo

# Work with GPT-4o on your repo
$ export OPENAI_API_KEY=your-key-goes-here
$ aider

# Or, work with Claude 3 Opus on your repo
$ export ANTHROPIC_API_KEY=your-key-goes-here
$ aider --opus

See the installation instructions and other documentation for more details.

Features

  • Run aider with the files you want to edit: aider <file1> <file2> ...
  • Ask for changes:
    • Add new features or test cases.
    • Describe a bug.
    • Paste in an error message or GitHub issue URL.
    • Refactor code.
    • Update docs.
  • Aider will edit your files to complete your request.
  • Aider automatically git commits changes with a sensible commit message.
  • Aider works with most popular languages: python, javascript, typescript, php, html, css, and more...
  • Aider works best with GPT-4o and Claude 3 Opus and can connect to almost any LLM.
  • Aider can edit multiple files at once for complex requests.
  • Aider uses a map of your entire git repo, which helps it work well in larger codebases.
  • Edit files in your editor while chatting with aider, and it will always use the latest version. Pair program with AI.
  • Add images to the chat (GPT-4o, GPT-4 Turbo, etc).
  • Add URLs to the chat and aider will read their content.
  • Code with your voice.

State of the art

Aider has the top score on SWE Bench. SWE Bench is a challenging software engineering benchmark where aider solved real GitHub issues from popular open source projects like django, scikit-learn, matplotlib, etc.

More info

Kind words from users

  • The best free open source AI coding assistant. -- IndyDevDan
  • The best AI coding assistant so far. -- Matthew Berman
  • Aider ... has easily quadrupled my coding productivity. -- SOLAR_FIELDS
  • It's a cool workflow... Aider's ergonomics are perfect for me. -- qup
  • It's really like having your senior developer live right in your Git repo - truly amazing! -- rappster
  • What an amazing tool. It's incredible. -- valyagolev
  • Aider is such an astounding thing! -- cgrothaus
  • It was WAY faster than I would be getting off the ground and making the first few working versions. -- Daniel Feldman
  • THANK YOU for Aider! It really feels like a glimpse into the future of coding. -- derwiki
  • It's just amazing. It is freeing me to do things I felt were out my comfort zone before. -- Dougie
  • This project is stellar. -- funkytaco
  • Amazing project, definitely the best AI coding assistant I've used. -- joshuavial
  • I absolutely love using Aider ... It makes software development feel so much lighter as an experience. -- principalideal0
  • I have been recovering from multiple shoulder surgeries ... and have used aider extensively. It has allowed me to continue productivity. -- codeninja
  • I am an aider addict. I'm getting so much more work done, but in less time. -- dandandan
  • After wasting $100 on tokens trying to find something better, I'm back to Aider. It blows everything else out of the water hands down, there's no competition whatsoever. -- SystemSculpt
  • Hands down, this is the best AI coding assistant tool so far. -- IndyDevDan
  • Best agent for actual dev work in existing codebases. -- Nick Dobos

Convert PDF to markdown quickly with high accuracy


Marker

Marker converts PDF to markdown quickly and accurately.

  • Supports a wide range of documents (optimized for books and scientific papers)
  • Supports all languages
  • Removes headers/footers/other artifacts
  • Formats tables and code blocks
  • Extracts and saves images along with the markdown
  • Converts most equations to latex
  • Works on GPU, CPU, or MPS

How it works

Marker is a pipeline of deep learning models:

  • Extract text, OCR if necessary (heuristics, surya, tesseract)
  • Detect page layout and find reading order (surya)
  • Clean and format each block (heuristics, texify)
  • Combine blocks and postprocess complete text (heuristics, pdf_postprocessor)

It only uses models where necessary, which improves speed and accuracy.

Examples

PDF                 | Type        | Marker | Nougat
Think Python        | Textbook    | View   | View
Think OS            | Textbook    | View   | View
Switch Transformers | arXiv paper | View   | View
Multi-column CNN    | arXiv paper | View   | View

Performance

Benchmark overall

The above results are with marker and nougat set up so they each take ~4GB of VRAM on an A6000.

See below for detailed speed and accuracy benchmarks, and instructions on how to run your own benchmarks.

Commercial usage

I want marker to be as widely accessible as possible, while still funding my development/training costs. Research and personal usage is always okay, but there are some restrictions on commercial usage.

The weights for the models are licensed cc-by-nc-sa-4.0, but I will waive that for any organization under $5M USD in gross revenue in the most recent 12-month period AND under $5M in lifetime VC/angel funding raised. If you want to remove the GPL license requirements (dual-license) and/or use the weights commercially over the revenue limit, check out the options here.

Hosted API

There is a hosted API for marker available here. It's currently in beta, and I'm working on optimizing speed.

Community

Discord is where we discuss future development.

Limitations

PDF is a tricky format, so marker will not always work perfectly. Here are some known limitations that are on the roadmap to address:

  • Marker will not convert 100% of equations to LaTeX. This is because it has to detect then convert.
  • Tables are not always formatted 100% correctly - text can be in the wrong column.
  • Whitespace and indentations are not always respected.
  • Not all lines/spans will be joined properly.
  • This works best on digital PDFs that won't require a lot of OCR. It's optimized for speed, and limited OCR is used to fix errors.

Installation

You'll need Python 3.9+ and PyTorch. You may need to install the CPU version of torch first if you're not using a Mac or a GPU machine. See here for more details.

Install with:

pip install marker-pdf

Optional: OCRMyPDF

Only needed if you want to use the optional ocrmypdf as the OCR backend. Note that ocrmypdf includes Ghostscript, an AGPL dependency, but calls it via CLI, so it does not trigger the license provisions.

See the instructions here

Usage

First, some configuration:

  • Inspect the settings in marker/settings.py. You can override any settings with environment variables.
  • Your torch device will be automatically detected, but you can override this. For example, TORCH_DEVICE=cuda.
    • If using GPU, set INFERENCE_RAM to your GPU VRAM (per GPU). For example, if you have 16 GB of VRAM, set INFERENCE_RAM=16.
    • Depending on your document types, marker's average memory usage per task can vary slightly. You can configure VRAM_PER_TASK to adjust this if you notice tasks failing with GPU out of memory errors.
  • By default, marker will use surya for OCR. Surya is slower on CPU, but more accurate than tesseract. If you want faster OCR, set OCR_ENGINE to ocrmypdf. This also requires external dependencies (see above). If you don't want OCR at all, set OCR_ENGINE to None.
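
The settings above can also be applied from a script. This is a minimal sketch, assuming marker picks these variables up from the process environment as described. The helper name marker_env is hypothetical and not part of marker; only the variable names come from the text above.

```python
import os


def marker_env(torch_device=None, inference_ram_gb=None, ocr_engine=None, base=None):
    """Return a copy of an environment mapping with marker overrides applied.

    marker_env is a hypothetical helper, not part of marker. The variable
    names TORCH_DEVICE, INFERENCE_RAM and OCR_ENGINE are taken from the text.
    """
    # Copy so the caller's mapping (or os.environ) is never mutated.
    env = dict(os.environ if base is None else base)
    if torch_device is not None:
        env["TORCH_DEVICE"] = torch_device
    if inference_ram_gb is not None:
        # Environment values must be strings.
        env["INFERENCE_RAM"] = str(inference_ram_gb)
    if ocr_engine is not None:
        env["OCR_ENGINE"] = ocr_engine
    return env


# Example: a machine with one 16 GB GPU, starting from an empty base mapping.
env = marker_env(torch_device="cuda", inference_ram_gb=16, base={})
print(env)
```

The resulting mapping can be passed as the env argument of subprocess.run when launching marker_single or marker.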

Convert a single file

marker_single /path/to/file.pdf /path/to/output/folder --batch_multiplier 2 --max_pages 10 --langs English
  • --batch_multiplier is how much to multiply default batch sizes by if you have extra VRAM. Higher numbers will take more VRAM, but process faster. Set to 2 by default. The default batch sizes will take ~3GB of VRAM.
  • --max_pages is the maximum number of pages to process. Omit this to convert the entire document.
  • --langs is a comma separated list of the languages in the document, for OCR

Make sure the DEFAULT_LANG setting is set appropriately for your document. The list of supported languages for OCR is here. If you need more languages, you can use any language supported by Tesseract if you set OCR_ENGINE to ocrmypdf. If you don't need OCR, marker can work with any language.
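
If you call marker_single from a script, it can help to assemble the argument list programmatically. This is a sketch only: marker_single_args is a hypothetical helper, and it uses just the flags documented above (--batch_multiplier, --max_pages, --langs).

```python
import shlex


def marker_single_args(pdf_path, out_dir, batch_multiplier=None, max_pages=None, langs=None):
    """Build an argument list for the marker_single command.

    Hypothetical helper, not part of marker. Optional flags are only added
    when a value is given, so omitted options fall back to marker's defaults.
    """
    args = ["marker_single", pdf_path, out_dir]
    if batch_multiplier is not None:
        args += ["--batch_multiplier", str(batch_multiplier)]
    if max_pages is not None:
        args += ["--max_pages", str(max_pages)]
    if langs:
        # --langs takes a comma separated list of languages.
        args += ["--langs", ",".join(langs)]
    return args


args = marker_single_args(
    "/path/to/file.pdf",
    "/path/to/output/folder",
    max_pages=10,
    langs=["English", "Spanish"],
)
print(shlex.join(args))
```

Passing the list form to subprocess.run avoids shell quoting problems with paths that contain spaces.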

Convert multiple files

marker /path/to/input/folder /path/to/output/folder --workers 10 --max 10 --metadata_file /path/to/metadata.json --min_length 10000
  • --workers is the number of pdfs to convert at once. This is set to 1 by default, but you can increase it to increase throughput, at the cost of more CPU/GPU usage. Parallelism will not increase beyond INFERENCE_RAM / VRAM_PER_TASK if you're using GPU.
  • --max is the maximum number of pdfs to convert. Omit this to convert all pdfs in the folder.
  • --min_length is the minimum number of characters that need to be extracted from a pdf before it will be considered for processing. If you're processing a lot of pdfs, I recommend setting this to avoid OCRing pdfs that are mostly images, which slows everything down.
  • --metadata_file is an optional path to a json file with metadata about the pdfs. If you provide it, it will be used to set the language for each pdf. If not, DEFAULT_LANG will be used. The format is:
{
  "pdf1.pdf": {"languages": ["English"]},
  "pdf2.pdf": {"languages": ["Spanish", "Russian"]},
  ...
}

You can use language names or codes. The exact codes depend on the OCR engine. See here for a full list of surya codes, and here for tesseract.
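
The metadata file can be generated rather than written by hand. This is a small sketch that produces the structure shown above from a plain mapping of file names to languages; build_metadata is a hypothetical helper name and the file names are the placeholders from the example.

```python
import json


def build_metadata(languages_by_pdf):
    """Map each pdf file name to {"languages": [...]}, the format shown above.

    Hypothetical helper, not part of marker.
    """
    return {name: {"languages": list(langs)} for name, langs in languages_by_pdf.items()}


metadata = build_metadata({
    "pdf1.pdf": ["English"],
    "pdf2.pdf": ["Spanish", "Russian"],
})
print(json.dumps(metadata, indent=2))
```

Write the result with json.dump to the path you pass as --metadata_file.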

Convert multiple files on multiple GPUs

MIN_LENGTH=10000 METADATA_FILE=../pdf_meta.json NUM_DEVICES=4 NUM_WORKERS=15 marker_chunk_convert ../pdf_in ../md_out
  • METADATA_FILE is an optional path to a json file with metadata about the pdfs. See above for the format.
  • NUM_DEVICES is the number of GPUs to use. Should be 2 or greater.
  • NUM_WORKERS is the number of parallel processes to run on each GPU. Per-GPU parallelism will not increase beyond INFERENCE_RAM / VRAM_PER_TASK.
  • MIN_LENGTH is the minimum number of characters that need to be extracted from a pdf before it will be considered for processing. If you're processing a lot of pdfs, I recommend setting this to avoid OCRing pdfs that are mostly images, which slows everything down.

Note that the env variables above are specific to this script, and cannot be set in local.env.
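
The parallelism cap of INFERENCE_RAM / VRAM_PER_TASK can be worked through with simple arithmetic. The sketch below assumes the quotient is rounded down, which the text does not state, and uses 4 GB per task as an illustrative value. The function names are hypothetical.

```python
def max_parallel_tasks(inference_ram_gb, vram_per_task_gb):
    """Upper bound on concurrent tasks per GPU: INFERENCE_RAM / VRAM_PER_TASK.

    Rounding down is an assumption; marker's exact rounding is not documented here.
    """
    if vram_per_task_gb <= 0:
        raise ValueError("vram_per_task_gb must be positive")
    return int(inference_ram_gb // vram_per_task_gb)


def effective_workers(requested, inference_ram_gb, vram_per_task_gb):
    """Workers that can actually run in parallel on one GPU."""
    return min(requested, max_parallel_tasks(inference_ram_gb, vram_per_task_gb))


# Requesting 15 workers on a 16 GB GPU at an assumed 4 GB per task.
print(effective_workers(15, 16, 4))
# Requesting 15 workers on a 48 GB GPU at the same per-task usage.
print(effective_workers(15, 48, 4))
```

Under these assumptions, asking for more workers than the cap allows does not raise throughput, so NUM_WORKERS is best chosen with the cap in mind.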

Troubleshooting

There are some settings that you may find useful if things aren't working the way you expect:

  • OCR_ALL_PAGES - set this to true to force OCR all pages. This can be very useful if the table layouts aren't recognized properly by default, or if there is garbled text.
  • TORCH_DEVICE - set this to force marker to use a given torch device for inference.
  • OCR_ENGINE - set this to surya or ocrmypdf.
  • DEBUG - setting this to True shows ray logs when converting multiple pdfs
  • Verify that you set the languages correctly, or passed in a metadata file.
  • If you're getting out of memory errors, decrease the worker count (for example, by increasing the VRAM_PER_TASK setting). You can also try splitting up long PDFs into multiple files.

In general, if output is not what you expect, trying to OCR the PDF is a good first step. Not all PDFs have good text/bboxes embedded in them.

Benchmarks

Benchmarking PDF extraction quality is hard. I've created a test set by finding books and scientific papers that have a pdf version and a latex source. I convert the latex to text, and compare the reference to the output of text extraction methods. It's noisy, but at least directionally correct.

Benchmarks show that marker is 4x faster than nougat, and more accurate outside arXiv (nougat was trained on arXiv data). We show naive text extraction (pulling text out of the pdf with no processing) for comparison.

Speed

Method | Average Score | Time per page | Time per document
marker | 0.613721      | 0.631991      | 58.1432
nougat | 0.406603      | 2.59702       | 238.926
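
The speedup quoted above can be checked directly from the timing columns of this table. The snippet only divides the reported numbers; no units are assumed, since the ratio is unitless.

```python
# Timing figures copied from the speed table above.
timings = {
    "marker": {"per_page": 0.631991, "per_document": 58.1432},
    "nougat": {"per_page": 2.59702, "per_document": 238.926},
}

# How many times faster marker is than nougat, per page and per document.
page_speedup = timings["nougat"]["per_page"] / timings["marker"]["per_page"]
doc_speedup = timings["nougat"]["per_document"] / timings["marker"]["per_document"]

print(f"per page: {page_speedup:.2f}x, per document: {doc_speedup:.2f}x")
```

Both ratios come out at roughly 4.1, consistent with the "4x faster" claim.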

Accuracy

First 3 are non-arXiv books, last 3 are arXiv papers.

Method | multicolcnn.pdf | switch_trans.pdf | thinkpython.pdf | thinkos.pdf | thinkdsp.pdf | crowd.pdf
marker | 0.536176        | 0.516833         | 0.70515         | 0.710657    | 0.690042     | 0.523467
nougat | 0.44009         | 0.588973         | 0.322706        | 0.401342    | 0.160842     | 0.525663
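
As a sanity check, the per-file scores above can be averaged and compared with the Average Score column in the speed table. The sketch also lists the files where nougat scores higher than marker. It assumes the average is a plain unweighted mean over the six files.

```python
files = [
    "multicolcnn.pdf", "switch_trans.pdf", "thinkpython.pdf",
    "thinkos.pdf", "thinkdsp.pdf", "crowd.pdf",
]

# Per-file scores copied from the accuracy table above.
scores = {
    "marker": [0.536176, 0.516833, 0.70515, 0.710657, 0.690042, 0.523467],
    "nougat": [0.44009, 0.588973, 0.322706, 0.401342, 0.160842, 0.525663],
}

# Unweighted mean per method (an assumption about how the average was computed).
means = {method: sum(vals) / len(vals) for method, vals in scores.items()}

# Files where nougat's score exceeds marker's.
nougat_wins = [
    name
    for name, m, n in zip(files, scores["marker"], scores["nougat"])
    if n > m
]

for method, mean in means.items():
    print(f"{method}: {mean:.6f}")
print("nougat higher on:", nougat_wins)
```

The means match the reported average scores to six decimal places, and nougat is ahead only on switch_trans.pdf and crowd.pdf.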

Peak GPU memory usage during the benchmark is 4.2GB for nougat, and 4.1GB for marker. Benchmarks were run on an A6000 Ada.

Throughput

Marker takes about 4GB of VRAM on average per task, so you can convert 12 documents in parallel on an A6000.

Benchmark results

Running your own benchmarks

You can benchmark the performance of marker on your machine. Install marker manually with:

git clone https://github.com/VikParuchuri/marker.git
cd marker
poetry install

Download the benchmark data here and unzip. Then run benchmark.py like this:

python benchmark.py data/pdfs data/references report.json --nougat

This will benchmark marker against other text extraction methods. It sets up batch sizes for nougat and marker to use a similar amount of GPU RAM for each.

Omit --nougat to exclude nougat from the benchmark. I don't recommend running nougat on CPU, since it is very slow.

Thanks

This work would not have been possible without amazing open source models and datasets, including (but not limited to):

  • Surya
  • Texify
  • Pypdfium2/pdfium
  • DocLayNet from IBM
  • ByT5 from Google

Thank you to the authors of these models and datasets for making them available to the community!