Home News Examples Demo Downloads FAQ Documentation Mailing Lists License

Project Status
The latest stable version of GeSHi was released on the 19th of Aug, 2012.

Supported Languages:
*Apache Log
*APT sources.list
*ASM (m68k)
*ASM (pic16)
*ASM (x86)
*ASM (z80)
*Backus-Naur form
*C for Macs
*C++ (with QT)
*Diff File Format
*DOT language
*FourJ's Genero
*INI (Config Files)
*Java 5
*KLone C & C++
*Objective C
*OpenOffice BASIC
*Oracle 8 & 11 SQL
*Pixel Bender
*Progress (OpenEdge ABL)
*Ruby on Rails
*Uno IDL
*VIM Script
*Visual BASIC
*Visual Fox Pro
*Visual Prolog
*Windows Registry Files

The current stable release brings eighteen new languages and bug fixes over the last release.

GeSHi 1.1.2alpha5 is the latest version from the development branch, with full C support (see the GeSHi development website).

GeSHi News

Here's where you can find out all the latest news about GeSHi - new releases, bug fixes and general errata.

GCC: GeSHi Contribution Contest
Hi folks!

We recently asked all of you for some code for our Code Repository, which we mainly use to verify that language files work properly, but which we also use to work on some additional functions that might come in handy for GeSHi.

There has been some input already, but far too little for some of the projects we are working on. This Code Repository is open to everyone and free to use in your own contributions to GeSHi.

So if you would like to have your name included in the THANKS file, there's one easy way: write a small script that uses GeSHi (and optionally the CodeRepo) to provide a new function that other users of GeSHi might find helpful in their daily work or for their applications, or that shows what is possible with current versions of GeSHi.

The rules for this contest are - just like using GeSHi - very simple:
  • Submissions must arrive at BenBE (AT) geshi _DOT_ org by December 1st 00:00 UTC
  • Submissions must be released under GPLv2 or GPLv3
  • Submissions should be well documented and easy to understand (usage and code).
  • The code should be secure against CSRF, XSS, SQL-Injections and other common forms of Web Attacks
  • No other libraries (except GeSHi) should be used.
  • It should be innovative ;-) So please, not Yet Another Pastebin ;-)

Depending on the number of submissions we will announce up to three winners. The winning submissions will be publicly announced here and included in the release and all following ones.

I hope for some interesting submissions that show various tasks GeSHi could be used for, or functions that would come in handy for upcoming GeSHi releases.

Unplanned server downtime and an upcoming release candidate
This morning, from 00:20 UTC until around 05:30 UTC, there was an unplanned server downtime due to the crash of a hardware node this server is hosted on. For all of you who were desperately hoping for the site to come back, here's some more news that you will probably like to hear.

There will be another Release Candidate this weekend, which will fix some minor issues we've encountered since 1.0.8. There will be no new features compared to the first RC, just some improved documentation, some fixes to language files that showed ill behaviour, and some corrections to language files that produced warnings when tested with our language file verification script.

There is only a little left to do for the final release, so we think we can announce the finished work here soon.

First Release Candidate available
I've just updated the Release Branch of the GeSHi SVN to contain the latest version of what could become the next GeSHi release. Well, not quite, as there will still be some changes to the code, but the current release candidate should give you a short introduction to what you can expect of the next release.

The version in the Release Branch contains mainly fixes over the 1.0.8 release and improvements to existing language files. For example, we fixed some problems with Symbol Highlighting (e.g. ; and | were ignored even if a language asked to highlight them).

We also accidentally introduced an issue with line numbering, where calls to start_line_numbers_at() were ignored with GESHI_HEADER_PRE_TABLE headers. The main problem, i.e. the styling issues, could not be resolved yet, though. If you have a solution for them, contact us at the usual places.

But no new GeSHi release without new features ;-) The next version will allow you to highlight arbitrary stuff inside strings. What's new about this is not that you can highlight escapes (which has been possible for quite a while), but that you can, for example, highlight format string escapes or variable names inside PHP strings, OR correctly render octal numbers ... There are thousands of possibilities and we have only implemented a few common ones so far. To try this feature, just feed some PHP source with lots of strings to the demo on this page and you'll see it ;-)

As mentioned a few weeks ago, the next version will contain a fix for an issue where Remote Code Inclusion could have been possible. To avoid this, no colons are allowed in Language File Paths (except on Windows, as the second character of the path). If you encounter any issues with this behaviour, feel free to get in contact so we can resolve them.
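
The colon rule described above can be sketched in a few lines of PHP. This is only an illustration of the idea, not GeSHi's actual code: the function name is_allowed_language_path and its $is_windows parameter are made up for this example.

```php
<?php
// Hypothetical helper, NOT GeSHi's real implementation: it only illustrates
// the rule "no colons in language file paths, except a Windows drive letter".
function is_allowed_language_path($path, $is_windows)
{
    $first_colon = strpos($path, ':');
    if ($first_colon === false) {
        return true; // no colon anywhere, so no "scheme://" style prefix
    }
    // On Windows a single colon is tolerated, but only as the second
    // character of the path (the drive letter separator).
    return $is_windows
        && $first_colon === 1
        && strpos($path, ':', 2) === false;
}

var_dump(is_allowed_language_path('geshi/', false));                  // bool(true)
var_dump(is_allowed_language_path('http://example.invalid/', false)); // bool(false)
var_dump(is_allowed_language_path('C:\\geshi\\', true));              // bool(true)
```

Passing the platform in as a parameter keeps the sketch easy to test; real code would more likely detect the platform itself.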

This release candidate doesn't yet contain updated documentation, but this will follow with the next RC. In the meantime, feel free to play around with this Release Candidate - it's installed on the server for you to test!

About security ...
Hi folks,

As I promised before, I will now give some more details on a security issue found and fixed in 1.0.8. The issue was present in earlier versions, but had no harmful effect there. As far as I know only has the problematic variant of this issue shown below.

Well, on to the issue: Many of you might know that for a long time GeSHi could be forced, with certain input, to output invalid HTML because of incorrect ender generation under rare circumstances. This issue was mainly present with markup languages (no further details on this, though).

If given such prepared input, GeSHi could be forced into an endless loop, which produced 100% server load AND used up all the memory PHP was allowed to take (on some systems I could verify up to 2 GB!). An attacker could abuse this malfunction for a Denial of Service attack on the webserver (verified to work) with minimal (unsuspicious) input but unpredictable side effects (for example, other programs crashing at random because of the missing memory). Previous versions only produced invalid XHTML when given that input.

A patch for this issue has been present in the SVN release branch as of 1.0.8rc2 and is included in the latest official 1.0.8 release. Administrators of Pastebins or other applications where users have a free choice of input language AND source, in particular, should upgrade to the latest version as soon as possible.

An issue of low to medium severity, depending on the system configuration, has also been found in the language file loading. There is a possibility for Remote Code Inclusion under rare circumstances if paths given to GeSHi aren't checked correctly. There is a patch available in SVN trunk that will be included in the next release. The severity of this problem is seen as low, because your Web Application has to be insecure itself for this attack to be feasible.

The next versions of GeSHi will concentrate on improving the overall reliability and security of GeSHi, so don't hesitate to report any such issues (for critical issues, please report by mail so we get some time to suggest a proper solution). Also, don't be afraid if we publish some more status reports on this, as we will take proper precautions to provide fixes before disclosing any details.

For those concerned about their webserver security, I can recommend the Suhosin Patches and the PHP Suhosin Extension. Both work well together with GeSHi, and from my own experience only a few Web Applications need to be changed (and the changes are often only a few lines). There have been some patches for GeSHi in 1.0.8 that fixed some problems with preg_replace and the /e modifier, which can be disabled with the mentioned extension. If you aren't sure whether this extension works together with your application: there's a test mode where violations are only reported, but no restrictions are enforced. This can be used for a slow transition towards this extension, which BTW runs on the GeSHi website without problems.
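
For readers who still have /e patterns in their own code, the usual replacement is preg_replace_callback(), which passes each match to a function instead of evaluating a replacement string as PHP code. The snippet below is a generic sketch and is not taken from GeSHi; the helper name wrap_keywords and the CSS class kw are invented for the example.

```php
<?php
// Hypothetical helper (not GeSHi code): wrap each listed keyword in a <span>.
function wrap_keywords($text, array $keywords)
{
    // Quote every keyword so it is matched literally inside the pattern.
    $quoted = array();
    foreach ($keywords as $keyword) {
        $quoted[] = preg_quote($keyword, '/');
    }
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/';

    // The callback receives the match array and returns the replacement.
    // No string is evaluated as PHP code, unlike with the /e modifier.
    return preg_replace_callback(
        $pattern,
        function (array $match) {
            return '<span class="kw">' . htmlspecialchars($match[0]) . '</span>';
        },
        $text
    );
}

echo wrap_keywords('if x then echo y', array('if', 'echo')), "\n";
// prints: <span class="kw">if</span> x then <span class="kw">echo</span> y
```

Note that this sketch only escapes the matched keywords; a real highlighter would also have to escape the surrounding text before emitting HTML.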

GeSHi 1.0.8 Released
Right on the scheduled release date we are proud to announce the new version 1.0.8 of GeSHi, released at 8:08 on 8th August 2008. There are plenty of bug fixes and thirteen new language files in this release, making a total of 109. This release also had a strong focus on performance, with quite a load of other features added, improved or otherwise revised. One of the biggest changes you will notice is the performance boost this new release brings: we sometimes got up to 75 KB/s on our computers, so even larger sources should no longer be a problem. As announced in the previous release, this release contains the second set of changed color schemes. I hope you like it; it is at least far more readable. If not, this release brings a new external style settings file feature that lets you override the language defaults without touching the language files: simply create a file and copy in the STYLES-array as $styledata.

Another big change in this version - if not the biggest of all - is the rewritten highlighting engine, which now allows for some more flexibility and fixes some bugs that seemed unsolvable in previous releases.

I bet I still missed some patches that you guys have sent in :). Please send again if you think I missed any!

Lately there has been an issue with especially old PHP versions like 4.3.X. If you are using such an old version, please consider upgrading PHP or, if this is not possible, check the bugtracker for updates on open issues, and I'm sure you'll get information on patches affecting you in time. Please also note that some features added in this release will be auto-disabled in old PHP versions, as I won't go through the hassle of writing workarounds for legacy PHP systems.

Once again we want to remind you of our code repository: There we collect source snippets to test, verify AND improve GeSHi. So if you have some small snippet of a language not yet supported there, feel free to donate it!

Download from the usual place, bug reports to the tracker please etc. etc...

Call for Participation and Release Preview
Users of GeSHi, these are bright days for you, with tons of new features and a much improved and streamlined code base: You'll get prettier code faster.

But we (BenBE and milian) are only two core developers hacking on GeSHi in our spare time. It's simply not possible for every single one of the 100+ supported languages to be as shiny and feature-rich as those the two of us use regularly. Often we don't know the first thing about some of the languages we support. This is why we need your help:

  • Send us example code!
    During the 1.0.8 development process we decided to start a code repository with snippets of various languages to test GeSHi on - yeah, the one I already mentioned here earlier. It served us well and we could spot various errors in some of the languages that we would not have tested otherwise. So if you have a nice example of some language, send it to us. To check which languages are already covered, take a look at
  • Make your favourite language shine!
    Starting with 1.0.8 we added quite some features which your language might not yet use (but should!). Take the new number highlighting for example, it supports various formats of floats, hex, binary and octal numbers. But we cannot know the required setup for all of the 100+ languages! So it's your turn to send us some patches. Or write us an email saying: "Language Foo supports Bar, Asdf, and Zark style numbers."
    Maybe add a code snippet as well (see above). This will help us for future releases.
  • If you are not able to fix your language by yourself, but spot an issue with GeSHi - please help by submitting a bug report with a code example and a description of the expected behaviour. That way we can work something out. And don't forget to include some contact details so we can contact you in case of questions.

Other NEW features you might like:
  • Comment Regexps and Strict Block Regexp delimiters
    Have you already run into the limitation that some stuff simply doesn't show up correctly because it's too complex for simple find and replace? This is your chance!
  • Support for multiple styles for strings
    Your language supports different ways of enclosing literal data? You want them to show up differently depending on the type of delimiters? No problem with GeSHi 1.0.8! You just specify styles for different string delimiters and get what you want.
  • Multiple styles for numbers
    What goes for strings is possible for numbers too! Many languages have standard ways to define integers, floats or even octal or binary and hexadecimal values. Coming with the next release you can select what types of numbers your language supports and how to style them.
  • PARSE_CONTROL madness
    We give much more control to language files, which makes it possible to set per-keyword-group DISALLOWED_BEFORE/AFTER and more.

    You can now also explicitly disable given features of GeSHi if you don't need them; take a look at the text.php language file for an example.
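
To make the idea of per-format number styles a bit more concrete, here is a rough sketch that classifies a numeric token and looks up a style for it. It does not use GeSHi's language file format; the function classify_number, the format names and the CSS values are all assumptions made for illustration.

```php
<?php
// Hypothetical classifier, not part of GeSHi: returns a format name for a
// numeric token, or null if the token matches none of the known formats.
function classify_number($token)
{
    // Order matters: the more specific formats are tried first.
    $formats = array(
        'hex'     => '/^0[xX][0-9a-fA-F]+$/',
        'binary'  => '/^0[bB][01]+$/',
        'octal'   => '/^0[0-7]+$/',
        'float'   => '/^\d+\.\d+(?:[eE][+-]?\d+)?$/',
        'integer' => '/^\d+$/',
    );
    foreach ($formats as $name => $pattern) {
        if (preg_match($pattern, $token)) {
            return $name;
        }
    }
    return null;
}

// Made-up styles, one per number format.
$styles = array(
    'hex'     => 'color: #208080;',
    'binary'  => 'color: #800080;',
    'octal'   => 'color: #808000;',
    'float'   => 'color: #800000;',
    'integer' => 'color: #cc66cc;',
);

foreach (array('0x1F', '0b101', '0755', '3.14', '42') as $token) {
    $format = classify_number($token);
    echo $token, ' => ', $format, ' (', $styles[$format], ")\n";
}
```

Which prefixes and suffixes count as valid differs from language to language, which is exactly why this information has to come from people who know the language.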

And these are only some of the various improvements! Overall there are speed improvements all over the place! Feel free to test it out yourself!

Also there are - as always - some security fixes, including one somewhat critical that might be used to produce heavy load on your server. More details on this as soon as the final release is out.

As always, we would like to hear your opinion. So feel free to use the usual contact mechanisms to send in your feedback.

BenBE and milian

Code, samples and statistics
I was always wondering how popular the online demo on this site really was. Well, it's not exactly what people call a pastebin, but you can use it as such if you store your source when initially highlighting.

But what I always was wondering about when looking at all those pastebins out there: What was the percentage of each highlighted language? Well, this depends on the visitors of such a pastebin, but also on how regularly they come visit this pastebin or if they update their sources regularly and thus produce dupes.

So why am I writing this post? I stumbled upon the internal cache tables for the demo application, which hold all the information needed to provide you with those little stats on average highlighting time, snippet size and the like. I played around with them a bit and thought: There hasn't been such information out yet, so maybe people will like to hear about it. So here it comes:

For the results to be as current as possible, I did a quick lookup of most values just before starting this post. At that time I had 123680 entries in the DB covering all languages GeSHi 1.0.8rc3 (the version on the server) supports. These entries cover dates from 12th June 2004 at 02:02 am to today, 04th August 2008 08:28 am, i.e. a total of 1514 days and 6:30 hours. Given an even distribution of requests, this would make one request roughly every 18 minutes (1058 seconds between requests).

The traffic to the demo site is somewhat interesting too. The total size of all samples together is 336.41 MiB (352,754,019 Bytes) with an average of 2852 Bytes per request - roughly one page per request. It took about 52338.23 seconds of processing time, which equals 14 hours, 32 minutes and 18 seconds of continuous work for the server.
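
The averages quoted above can be re-derived from the raw figures with a few lines of PHP. The input numbers are copied from this post; the variable names are just for illustration.

```php
<?php
// All input figures are copied from the post.
$entries      = 123680;                     // rows in the demo cache tables
$span_seconds = 1514 * 86400 + 6.5 * 3600;  // 1514 days and 6:30 hours
$total_bytes  = 352754019;                  // total size of all samples
$cpu_seconds  = 52338.23;                   // total highlighting time

$seconds_per_request = $span_seconds / $entries;
$bytes_per_request   = $total_bytes / $entries;
$total_mib           = $total_bytes / (1024 * 1024);

printf("%.0f s between requests (%.1f min)\n", $seconds_per_request, $seconds_per_request / 60);
printf("%.0f bytes per request\n", $bytes_per_request);
printf("%.2f MiB in total\n", $total_mib);
// gmdate() formats a number of seconds below one day as hours:minutes:seconds.
echo gmdate('H:i:s', (int) $cpu_seconds), " of processing time\n";
```

This reproduces the 1058 seconds between requests (about 17.6 minutes), 2852 bytes per request, 336.41 MiB and 14:32:18 of processing time.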

Let's move on to the language ranking: Currently there are 108 supported languages in 1.0.8rc3. You see: 108 languages in 1.0.8rc3 ;-) All of them have been used, although the distribution concentrates on about 20 languages. With 22857 hits and 70.75 MiB of input data, first place goes (without surprise) to PHP, followed (to my surprise) by ActionScript (29.58 MiB input) in 12313 samples. After this follow HTML (html4strict, 6449 hits, 29.39 MiB), C++ (cpp, 6211 hits, 15.39 MiB) and ABAP (abap, 5124 hits, 7.45 MiB).

Although this ranking stays quite stable when looking at input size and total hit count it dramatically changes once you look at the average sample size: If you do this you get ActionScript (french) ranked first with about 8.4 KiB per highlighted sample, followed by Plaintexts with 6.5 KiB and Tcl with 6.3 KiB per sample.

Let's draw some conclusions out of this:
  • People highlight different stuff than you expect
  • Although languages like C++, CSS, JavaScript, Perl and Python are quite famous around the globe they do not seem to draw too much attention to this demo
  • People seem to test primarily small samples and rarely highlight large ones.

And now to another statistical fact: The next release is overdue ;-) Work on the next release has already reached the attribution stable and only minor issues are left, but there is still some open work that will have to be finished before the next version can be released to the^W^W^W^W go wild. Based on the current schedule, this will be on 08th August 2008. The next version will not only have a new sub-minor release number, but also loads of new features, yet another bunch of optimizations and new languages, and many other things I'll introduce to you soon. So stay tuned for the next updates!

An old bug dies ...
This might be surprising for you guys, but it isn't ;-) So, what happened?

Many of you might have noticed at some point, that GeSHi sometimes had problems with code like

echo "?>"; //This is still inside PHP

as described in a bug report opened on 19th Oct 2005. The reason for this issue, and also the reason it took so long to fix, was the way GeSHi determined where PHP code (and similar blocks in other languages like HTML or CFM) was. This was changed today by a patch that extends the parser, which now supports a somewhat more elaborate way of finding PHP blocks. The old way was to look for the start and end of such blocks by simple means like strpos. This works for simple cases where the ender cannot appear inside such a block itself, but breaks as soon as this prerequisite is not met - which it isn't for CFM and PHP.

So the solution was simple: Allow the language file to describe what such a block has to look like. This is done with regular expressions in the case of PHP, and if you look at the most recent trunk version of the PHP language file you might notice a weird-looking string in the place where, according to the old documentation, an array should have been. Well, that's it: The Strict Block RegExp! - Our new feature that has already fixed a problem with the PHP language file, and will be reused to fix some more similar issues we already know about in other language files.
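
The difference between the two approaches can be shown with a small, self-contained sketch. The regular expression below is a simplified stand-in written for this example and is NOT the Strict Block RegExp shipped in the PHP language file; it only knows about single- and double-quoted strings and ignores comments and heredocs.

```php
<?php
// Sample input: the block ender also appears inside a string literal.
$source = '<?php echo "?>"; echo 1; ?><b>done</b>';

// Old approach: take everything up to the first ender found by strpos().
$start = strpos($source, '<?php');
$end   = strpos($source, '?>', $start);
$naive_block = substr($source, $start, $end + 2 - $start);

// Regex approach: quoted strings are consumed as a whole, so an ender
// inside a string literal cannot terminate the block early.
$pattern = <<<'REGEX'
/<\?php(?:"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|[^"'?]|\?(?!>))*+\?>/s
REGEX;
preg_match($pattern, $source, $match);
$regex_block = $match[0];

echo $naive_block, "\n"; // block is cut off inside the string literal
echo $regex_block, "\n"; // block runs up to the real ender
```

The strpos() version stops at the ender inside the quoted string, while the regex version returns the whole block up to the real ender.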

So let us remind you of the Code Repo I posted about previously, since we can only fix issues we know about and can reproduce ourselves.

But Strict Block Regexps aren't the only new feature you'll notice in the upcoming release. If you're curious, feel free to test the Trunk Version of GeSHi or have a look at the Changelog we constantly update inside the SVN.

Farewell SF#1330968, that grew 1009 days, 14 hours and 27 minutes old! We'll miss you!

Call for Code
Some people may already have noticed that we added a new directory to the SVN repository that we use for testing the highlighting of GeSHi.

This has allowed us to make many improvements to GeSHi recently while ensuring at the same time that we didn't break any old stuff. But at the moment we are missing examples for various languages in our Code Repo, so we can't check highlighting for all languages yet.

So what we need is:
  • Small pieces of code of approximately 10 - 50 lines of code
  • Typical code, i.e. it should illustrate what code in this language usually looks like, real-world examples preferred
  • Preferably for languages not yet covered
  • No more comments than needed, basically as many comments as typical code in this language would have.
  • Avoid code like the Sun JRE where there's one page of documentation for 3 lines of code. If you have some code like this, try reducing the comments
  • The code snippet in itself should be free of syntax errors and be valid code for this language.
  • The code should use as many features of the language as possible. If this becomes too elaborate, split it up into multiple files.
  • The code must be available to be published in our repo (and stay there)

There are already some example snippets we gathered during debugging GeSHi, so feel free to contribute to them.

Please send your Code Snippets you'd like to contribute to this Code Repo to nigel @@ geshi .dot. org.

And if there is enough code available, we might even release an additional package we are working on. Due to the lack of code, we couldn't test it on real-world examples yet.

Thanks in advance for any contributions!

BenBE and milian

100th language file for GeSHi included
This afternoon (local time ;-)), while I was asking for support for some older issue "blocking" up the bugtracker at SourceForge, I stumbled upon the people of the Boo interpreter. One of them already had a complete Boo language file lying around - that was number 99 of the languages added to GeSHi.

So now comes the interesting part: After negotiating how to transmit the file (as the tracker denied him attaching the file) he suddenly asked, if I'd mind getting yet another language file: CIL or Common Intermediate Language which is the .NET assembler language common to all .NET compilers. So I asked him to file me a bug attaching the language file and there we got: Language File #100!

Two more files have been added in the meantime, so the language file count is already up to 102, but I hope you folks will keep on sending in language files to improve language support even further.