blog > Docs rework (2022 Apr 23)

Docs rework

Alright, after 2.5 months of on-and-off work, the rework of this documentation website is finally done. Previously this website was a single (huge) page with a lot of text and symbol information mixed together; now that same information is split between the docs and the functions/structs/enums/variables pages. There's also this new 'blog' section, so I can put more 'discovery' style content in here and keep the docs clean, tidy and to the point.

In the beginning I wrote the docs HTML by hand, which was fine for a while but soon became tedious. I'm not a big fan of Markdown because many places use a different dialect (the meaning of __/**/* differs between, for example, Slack and Discord) and certain elements (like links) break the flow of the text in source form. So at some point I came up with an idea for a markup language and made a first implementation, mmparse.c. I named it 'margin markup' because the configuration of the presentation is put in the document's margin. It has its own problems of course (for example, you constantly need to reorder/copy the margin when editing lines), but so far I think the positives outweigh the negatives.

That was better, but I still copied symbol information from my IDA database into the docs manually. That meant the docs could get outdated very easily, information was spread between the docs and the IDA database (because I usually put function descriptions in the docs but not as comments in IDA), and synchronizing it all the time was just an annoying chore.

Then at some point I had the idea of generating these docs from the IDA database. Since the IDA database is a binary file that is undocumented (I think) and may change between versions of IDA, I decided to use IDA's 'dump database to IDC file' feature and parse that IDC file instead. IDC is IDA's own C-like scripting language, so I wrote a parser for it (docs/idcparse.c).
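To give a rough idea of what reading such a dump involves, here is a minimal sketch that pulls an address and a name out of a single statement. It is not how docs/idcparse.c works (that is a real parser for the language); it only pattern-matches one line. The statement form `set_name(0X529B50, "...");` is an assumption about what the dump contains (the exact function name may differ between IDA versions), and `struct idc_name` and `parse_set_name` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one named address found in the dump. */
struct idc_name {
    unsigned long addr;
    char name[256];
};

/* Tries to read one statement of the ASSUMED form
 *     set_name(0X529B50, "SomeName");
 * from `line`. Returns 1 and fills `out` on success, 0 if the line
 * is anything else. Names longer than 255 characters are cut off. */
int parse_set_name(const char *line, struct idc_name *out)
{
    assert(line != NULL && out != NULL);
    /* Spaces in the format match any amount of whitespace (including
     * none); %lx accepts an optional 0x/0X prefix. */
    return sscanf(line, " set_name ( %lx , \"%255[^\"]\"",
                  &out->addr, out->name) == 2;
}
```

A one-line matcher like this falls apart as soon as statements span several lines or contain expressions, which is a good reason to write an actual parser instead.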

After doing that, I moved all of the function/structure documentation into comments in IDA, which are now also parsed as 'margin markup'. Now I can write docs and blog posts and easily link to any symbol, because each symbol is written to its HTML page and anchors are added as needed.

Small example:

Here's a function {529B50} and variable {838454} struct {struct SmsData} and  ||| ref,ref,ref
enum {enum SMS_TYPE} with members {{struct SmsData+2} == {enum SMS_TYPE/0x4}} ||| ref,code,ref,ref
struct inner member and chained members {{struct Career.5D20.1404}}           ||| code,ref
with pointer {{struct SmsMessage.0.2}}                                        ||| code,ref
chained from variable of type struct {{838640.328}}                           ||| code,ref
chained from variable of type struct pointer {{8384C4.8.E4}}                  ||| code,ref

Results in:

Here's a function SmsMessageList::SendOutrunInfoSms and variable smsDatas struct struct SmsData and enum enum SMS_TYPE with members type == SMS_TYPE_OUTRUN_INFO struct inner member and chained members struct Career.smsMessageList.numUnreadMessages with pointer struct>type chained from variable of type struct shownDialog.numButtons chained from variable of type struct pointer pUIData->field_8->fngPackagesDC.__parent.first
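The source lines in the example above each have body text on the left and a comma-separated list of directives on the right of `|||`. As a sketch of the first step a margin markup parser might take, the function below splits one such line into those two parts. The function name `split_margin` and the trimming rules are assumptions for illustration; the actual mmparse.c may handle this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits `line` at the first "|||" into body text and margin directives.
 * Trailing blanks of the text and leading blanks of the margin are dropped.
 * Returns 1 if a margin was found, 0 if the line has no "|||" (the margin
 * is then set to an empty string). Output is truncated to fit the buffers. */
int split_margin(const char *line, char *text, size_t textsz,
                 char *margin, size_t marginsz)
{
    const char *sep, *end, *m;
    size_t n;

    assert(line && text && margin && textsz > 0 && marginsz > 0);

    sep = strstr(line, "|||");
    end = sep ? sep : line + strlen(line);
    /* Drop the padding that aligns the margins in the source file. */
    while (end > line && (end[-1] == ' ' || end[-1] == '\t'))
        end--;
    n = (size_t)(end - line);
    if (n >= textsz)
        n = textsz - 1;
    memcpy(text, line, n);
    text[n] = '\0';

    margin[0] = '\0';
    if (!sep)
        return 0;
    m = sep + 3;
    while (*m == ' ' || *m == '\t')
        m++;
    n = strlen(m);
    if (n >= marginsz)
        n = marginsz - 1;
    memcpy(margin, m, n);
    margin[n] = '\0';
    return 1;
}
```

For the line `with pointer {{struct SmsMessage.0.2}}   ||| code,ref` this yields the text `with pointer {{struct SmsMessage.0.2}}` and the margin `code,ref`; matching each directive to its brace pair would be the next step.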


Docs rework (continuation)

Now when a struct or enum is renamed, I get an error while the docs are being generated, so the docs can't drift too far from what I have in the IDA database. Function and variable renames are free because those are referenced by address, so nothing in the docs source needs to be updated for them.
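The difference between the two kinds of references can be sketched as follows. The tables here are hypothetical stand-ins for what a parsed IDC dump would provide (filled with a few names from the example above), and `resolve_ref` is a made-up function, not the generator's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct symbol { unsigned long addr; const char *name; };

/* Hypothetical data standing in for the contents of a parsed IDC dump. */
static const struct symbol symbols[] = {
    { 0x529B50, "SmsMessageList::SendOutrunInfoSms" },
    { 0x838454, "smsDatas" },
};
static const char *const struct_names[] = { "SmsData", "SmsMessage", "Career" };

/* Resolves a reference such as "529B50" or "struct SmsData" to a name.
 * Returns NULL when the target doesn't exist, which a generator would
 * report as an error instead of emitting a dead link. */
const char *resolve_ref(const char *ref)
{
    size_t i;
    unsigned long addr;
    char *end;

    assert(ref != NULL);
    if (strncmp(ref, "struct ", 7) == 0) {
        /* Name-based: stops resolving as soon as the struct is renamed. */
        for (i = 0; i < sizeof struct_names / sizeof *struct_names; i++)
            if (strcmp(ref + 7, struct_names[i]) == 0)
                return struct_names[i];
        return NULL;
    }
    /* Address-based: the current name is looked up, so renames are free. */
    addr = strtoul(ref, &end, 16);
    if (end == ref || *end != '\0')
        return NULL;
    for (i = 0; i < sizeof symbols / sizeof *symbols; i++)
        if (symbols[i].addr == addr)
            return symbols[i].name;
    return NULL;
}
```

With this data, `resolve_ref("529B50")` keeps working no matter what the function is called in the database, while `resolve_ref("struct OldName")` returns NULL and would stop the build.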

Now I probably won't touch this project for a few months while I go back to another project that I haven't touched in months :^)