blog > Docs rework (2022 Apr 23)
Alright, finally after 2.5 months of on-off work, the rework/revamp of this documentation website is done. Previously this website was just a single (huge) page with a lot of text and symbol information in between, now that same information is split between the docs and functions/structs/enums/variables pages. There's also this new 'blog' section, so I can put more 'discovering' style content in here and keep the docs more clean and tidy and to-the-point.
In the beginning I wrote the docs html manually, which was fine for a while
but soon became a tedious job. I'm not a big fan of Markdown because it seems
like many places use a different dialect (the meaning for __
/**
/*
may
differ, for example on Slack and Discord) and certain elements (like links)
break the flow of the text in source form. So at some point I came up with an
idea for a markup language and I made a first implementation of mmparse.c.
I named this 'margin markup' because the configuration of the presentation
would be put in the document's margin. It has its own problems of course (for
example you constantly need to reorder/copy the margin when editing lines) but
so far I think the positives outweigh the negatives.
That was better, but I still copied symbol information from my IDA database into the docs manually. That means it could get outdated very easily, information would be spread between the docs and the IDA database (because usually I put function descriptions in the docs but not as comments in IDA) and it is just an annoying chore to synchronize it all the time.
Then at some point I had the idea of just generating these docs based on the
IDA database. Since the IDA database is a binary file, which is undocumented
(I think), and may change between versions of IDA, I decided to use the
'dump database to IDC file' feature of IDA and parse that IDC file instead.
IDC is some kind of custom c-like scripting language, so I wrote a parser
for that (docs/idcparse.c
).
After doing that, I moved all of the function/structure documentation into comments in IDA, which now also get parsed using the 'margin markup' format I made. Now I can write docs and blogposts and easily link to all of the symbols, because they are written to their html pages and anchors are added as needed.
Small example:
Here's a function {529B50} and variable {838454} struct {struct SmsData} and ||| ref,ref,ref enum {enum SMS_TYPE} with members {{struct SmsData+2} == {enum SMS_TYPE/0x4}} ||| ref,code,ref,ref struct inner member and chained members {{struct Career.5D20.1404}} ||| code,ref with pointer {{struct SmsMessage.0.2}} ||| code,ref chained from variable of type struct {{838640.328}} ||| code,ref chained from variable of type struct pointer {{8384C4.8.E4}} ||| code,ref
Results in:
Here's a function SmsMessageList::SendOutrunInfoSms and variable smsDatas struct struct SmsData and
enum enum SMS_TYPE with members type == SMS_TYPE_OUTRUN_INFO
struct inner member and chained members struct Career.smsMessageList.numUnreadMessages
with pointer struct SmsMessage.data->type
chained from variable of type struct shownDialog.numButtons
chained from variable of type struct pointer pUIData->field_8->fngPackagesDC.__parent.first
Docs rework (continuation)
When a struct or enum changes name now, I'll get an error while the docs are being generated. That way the docs can't be too desynchronized anymore from what I have in the IDA database. Function and variable name changes are free because those are referenced by their addresses, so nothing in the docs source needs to be updated for those.
Now I probably won't touch this project for a few months while I go back to another project that I haven't touched in months :^)