software architecture

keep it simple

i have experienced this first-hand with colors in the libreoffice codebase. if something requires a trick, or clever conditionals and special-case handling, then it's probably not the right approach. such tricks later become maintenance nightmares. i have noticed that the senior devs avoid such approaches whenever possible.

rapid prototyping

when you start a project it looks like you have a lot of time, but it disappears pretty quickly. rapid prototyping in the early phase of the project helps you find all the ways "not to implement" it, so you can avoid the magical "i didn't know about this" failures in production. a prototype just has to work; it doesn't have to be complete or be the final design. you can spend an eternity planning out the perfect implementation, but it never works out the way you thought. iteration is the magic word in software.

clean atomic commits

each patch/commit that you create should do just one thing and nothing else. please avoid unnecessary changes like formatting and whitespace changes, as they just add noise to the patch and make it harder to review. when someone is reviewing your patch they don't know anything about it, so they have to load your change into their head and make sense of it. when they find that they also loaded formatting changes, they have to filter those out, which slows the review down. while programming, don't create all the commits at the end; commit along the way, it makes debugging easier.

review it yourself first

before pushing the patch, review it yourself: both the small change that you just made and where it stands in the longer series of changes, or in relation to the commit you are going to amend. it often happens that an amend invalidates comments and assumptions added earlier in the code, so they need to be updated. then question every single line as if it were not your code, and only then push and ask someone else to review it.

refactoring

sometimes cleanly removing a whole feature, cleaning up the surrounding code, and then cleanly adding the feature back is one way to create clean, reviewable commits. interfaces allow multiple implementations to exist, but the interface api and the platform api might not align, which makes the implementation harder.

commit messages

the commit message should be clean and to the point. it should not include walls of text that make it hard for the reader to distinguish the real, useful information from text that you merely thought would be nice to have there. write everything that your fellow developer from the future should know about the code you are pushing. when they run git blame and read the commit message, they should get the whole technical and design story of that change.

comments and documentation

comments should explain why the code is the way it is. this gives readers quick clues about what is going on, so that they don't have to go to the definition of each call and figure things out. header files should have doxygen comments that describe the api and the internals of a struct, including why it exists. here are a few code snippets from sanjay ghemawat's code; with comments like these, one can read the code like prose.

// FindFilter is a filter that produces matching nodes under a filesystem
// directory.
type FindFilter struct {
    dir       string
    ifmode    func(os.FileMode) bool
    skipdirif func(string) bool
}

// Find returns a filter that produces matching nodes under a
// filesystem directory. The items yielded by the filter will be
// prefixed by dir. E.g., if dir contains subdir/file, the filter
// will yield dir/subdir/file. By default, the filter matches all types
// of files (regular files, directories, symbolic links, etc.).
// This behavior can be adjusted by calling FindFilter methods
// before executing the filter.
func Find(dir string) *FindFilter {
    return &FindFilter{
        dir:       dir,
        ifmode:    func(os.FileMode) bool { return true },
        skipdirif: func(d string) bool { return false },
    }
}

data and code

data and code are mostly separate entities, meaning that functions with logic should not have loads of data local to them, and data should be stored systematically. don't duplicate code. of course you won't know the best approach on the first go... just make it work and slowly iterate, cleaning up more and more in each step. abstract details like magic numbers into enums, and related data into its own type.
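
a minimal sketch of that idea in go, under my own assumptions: the names Severity, severityLabels and Label are hypothetical and only there for illustration. the magic numbers become a named type with constants, the data sits in one table, and the function holds nothing but logic.

```go
package main

import "fmt"

// Severity is a hypothetical enum-like type. It replaces bare integers
// such as 0, 1, 2 that would otherwise be scattered through the code.
type Severity int

const (
	SeverityInfo Severity = iota
	SeverityWarning
	SeverityError
)

// severityLabels keeps the data in one place, apart from the logic.
var severityLabels = map[Severity]string{
	SeverityInfo:    "info",
	SeverityWarning: "warning",
	SeverityError:   "error",
}

// Label contains only logic: look the value up and fall back to
// "unknown" when the table has no entry for it.
func Label(s Severity) string {
	if label, ok := severityLabels[s]; ok {
		return label
	}
	return "unknown"
}

func main() {
	fmt.Println(Label(SeverityWarning))
	fmt.Println(Label(Severity(42)))
}
```

adding a new severity now means touching the constant list and the table, not the function.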

what you see/think is not what it is

good code is very descriptive about itself. it doesn't present a fragmented context/model of the problem to the reader but tries to fill in as many details as possible, either through comments or by using code constructs which make that aspect of the mental model 'obvious' at first look, so the reader knows right away where to go next. enums and data structures are one way to do it; proper naming of functions and structures helps the reader quickly understand what data is being transformed in what way. but most of the code out there is not that good, and some of it is mine ;).

opinions on approaches

as a developer you don't have to follow the code pointers religiously. they are there as an indicator of what some dev thought about the problem and its potential solution. if it looks too complicated to you and you think there is a simpler solution, then you should say so and see where it goes from there. you are responsible for your code, so don't take suggestions and feedback as orders that have to be followed.

documentation

documentation and man pages should have minimal examples so that the users of the program/library have something to hold on to. maybe this is what a tutorial is supposed to do. i hate man pages without any examples at the end.

the first attempt at ipc

recently i decided to write a toy project which built a libeditor.so file with some data and some getters and setters. then i wrote a client that linked against libeditor.so and printed the data over and over, sleeping for 1 second in between. i then wrote another program which also linked against libeditor.so and set the string data to something else. i thought that this way i would be able to change the string being read by the client program. i was off by a long margin.

turns out that shared objects are a way to organize code into a separate blob such that the code is loaded once into ram and shared, while each process gets its own private copy of the library's data and variables in its own address space. so a setter called in one process never touches the string that another process sees. shared objects also help with live reloading of code; i don't know how that happens under the hood, but it's a very well known use case of dlls. they can also be loaded by running code as plugins/extensions. ipc is different, it's how two programs communicate. ville was kind enough to share a really nice document on ipc on #C++-General.

had another chat about ipc on #emacs. i started the conversation with a question: string operations are slow, yet api calls send strings around. after some chatting it came up that the strings are not passed around arbitrarily; there is a structure to the whole api, or at least there is supposed to be one. there is a serialization and de-serialization step happening with each call, and most of the time the overhead is hidden away behind the socket connection setup, but it depends on the api design. apis should be designed mindfully.
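
to make the serialization and de-serialization step concrete, here is a small sketch in go. it is not what emacs or my toy project does: the SetTextRequest message and the roundTrip helper are hypothetical, json is just one possible wire format, and both ends of the pipe live in the same process only to keep the example self-contained. in real ipc the write end and the read end would belong to two different processes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// SetTextRequest is a hypothetical structured message. The string is not
// sent on its own; it travels inside a message with a known shape.
type SetTextRequest struct {
	Op   string `json:"op"`
	Text string `json:"text"`
}

// roundTrip serializes req into an OS pipe and de-serializes it again
// from the other end, returning what the receiving side would see.
func roundTrip(req SetTextRequest) (SetTextRequest, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return SetTextRequest{}, err
	}
	defer r.Close()

	// Sender side, serialization: struct -> bytes on the wire.
	if err := json.NewEncoder(w).Encode(req); err != nil {
		w.Close()
		return SetTextRequest{}, err
	}
	if err := w.Close(); err != nil {
		return SetTextRequest{}, err
	}

	// Receiver side, de-serialization: bytes -> struct.
	var got SetTextRequest
	if err := json.NewDecoder(r).Decode(&got); err != nil {
		return SetTextRequest{}, err
	}
	return got, nil
}

func main() {
	got, err := roundTrip(SetTextRequest{Op: "set", Text: "hello"})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%+v\n", got)
}
```

every call pays for one encode and one decode, which is why the shape of the messages is part of the api design.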