Rajan Maghera

Name: Rajan Maghera
Phone: 587-783-5523
Email: rmaghera AT ualberta.ca

A Little Bit About Me

I'm a CS + Business student @ UAlberta and based in Edmonton, Alberta.

In recent years, I've taken a big interest in compiler design. I've devoted my time to learning about compiler infrastructure/optimizations and have worked on a few projects in the space. I'm currently working on a static analysis tool for RISC-V assembly code.

In the age of heterogeneous architectures and diminishing increases in raw power, writing and generating optimized code becomes more and more vital. The industry has turned to HPC, parallelism, and intelligent algorithms to fill the gaps. I believe that compilers are at the forefront of this unique era. My goal is to work on software that both performs well and teaches programmers about best practices in the space.

Outside of low-level stuff, I enjoy working on web development and learning about software engineering practices. I'm also a big fan of education and have worked as a TA at UAlberta.

Outside of CS, my hobbies include photography, videography, graphic design, cooking, running, and cheering for the Oilers.

Projects

RISC-V Assembly Static Analysis

[Summer 2023] Assembly ain't as hard as you think.

My pride and joy: a VSCode extension that reports linting, calling-convention, and control-flow errors for RISC-V assembly.

Motivation
While I was a TA for CMPUT 229 (Computing Architecture course at UAlberta; assignments done in RISC-V assembly), the most common mistakes involved basic register conventions: register clobbering, invalid stack manipulation, incorrect procedure calls, and the like. Depending on the case, student code with incorrect conventions may or may not fail against our own private helper code or marking scripts. This model also propagates the “it works for me” mentality, as the environment is pretty foreign for most 2nd year students.

Although Professor J. Nelson Amaral did an amazing job teaching the course, the tooling to catch these mistakes did not exist. We use RARS to assemble and simulate the runtime, but it only provides assembler-type error reporting.

We came up with the idea for a static analysis tool that would fill in the gaps by applying a set of RISC-V calling conventions to assembly source. The guiding idea is that in the subset of

I am completing this project over Summer 2023 under an NSERC USRA.

Implementation
As I am the sole programmer on the project, I got to pick Rust as the primary language of implementation, and oh boy was it a great choice. Other than Rust being Rust, I chose it for two primary reasons:

it’s a low(ish)-level language, like C++, which was originally my first choice, and
it offers easy cross-compilation to WASM.

The second point was particularly important to me as I wanted the usage to be as easy as possible. Yeah, we could’ve made just a command line tool that spits out warnings, but I wanted students to be able to see feedback in real time in an IDE. As such, we targeted a VSCode Extension as the final form. You can use the Language Server Protocol to transmit this type of data to most IDEs, but it usually requires the user to install an external binary and ensure it is working properly. By using WASM, we can create a VSCode Extension that runs the WASM functions. Because VSCode ships with a Node runtime built-in, there are no external dependencies.

Before beginning the project, I planned to use ANTLR4 to generate a parser (when it was gonna be in C++). Partly because I wanted to play with Rust for a bit and because RISC-V ASM was so simple, I did the most Rust thing ever: just (re)write it in Rust, a.k.a. from scratch. It honestly didn’t take as long as I thought, as Rust has numerous traits that I just had to implement. You can probably imagine that it’s mostly a lot of Into and TryInto, and you’d be right.
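To give a flavour of that conversion-trait style, here is a minimal sketch, not the project's actual code. The Register enum, the ParseError type, and the small register subset are all hypothetical names picked for illustration. It implements TryFrom<&str>, which also gives you the matching try_into() for free.

```rust
use std::convert::TryFrom;

/// A small, illustrative subset of RISC-V integer registers (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Register {
    Zero,
    Ra,
    Sp,
    A(u8), // a0..=a7
    T(u8), // t0..=t6
    S(u8), // s0..=s11
}

/// Hypothetical error type carrying a human-readable message.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseError(pub String);

impl TryFrom<&str> for Register {
    type Error = ParseError;

    fn try_from(token: &str) -> Result<Self, Self::Error> {
        let err = || ParseError(format!("unknown register: {}", token));

        // Registers with fixed names come first.
        match token {
            "zero" | "x0" => return Ok(Register::Zero),
            "ra" => return Ok(Register::Ra),
            "sp" => return Ok(Register::Sp),
            _ => {}
        }

        // Guard the split below: need at least two ASCII characters.
        if token.len() < 2 || !token.is_ascii() {
            return Err(err());
        }

        // Split e.g. "a0" into the class letter "a" and the index "0".
        let (class, index) = token.split_at(1);
        if !index.bytes().all(|b| b.is_ascii_digit()) {
            return Err(err());
        }
        let index: u8 = index.parse().map_err(|_| err())?;

        match (class, index) {
            ("a", 0..=7) => Ok(Register::A(index)),
            ("t", 0..=6) => Ok(Register::T(index)),
            ("s", 0..=11) => Ok(Register::S(index)),
            _ => Err(err()),
        }
    }
}
```

With this in place, a call like Register::try_from("a0") or "sp".try_into() returns a Result, so a malformed token becomes a value the caller has to handle rather than a crash.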

The actual smart stuff derives from dataflow analysis. We came up with a bunch of very neat and elegant algorithms based off of liveness analysis and available expressions. This part took the longest, but I’m the most proud of it; the algorithms are clean and fast, and should scale well, even beyond RISC-V assembly. I’m not gonna bore you with the details of the algorithms, but you can look at the source if you’re interested.
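For readers who haven't met liveness analysis before, here is a minimal sketch of the textbook version, which is only the starting point and not the project's own algorithms. The Node struct, the node helper, and the three-instruction example are assumptions made for illustration. It iterates the standard equation live_in = uses ∪ (live_out − defs) backwards until nothing changes.

```rust
use std::collections::BTreeSet;

/// One instruction in a toy control-flow graph (hypothetical representation).
pub struct Node {
    pub uses: BTreeSet<&'static str>, // registers read by the instruction
    pub defs: BTreeSet<&'static str>, // registers written by the instruction
    pub succs: Vec<usize>,            // indices of successor instructions
}

/// Convenience constructor for building small examples.
pub fn node(uses: &[&'static str], defs: &[&'static str], succs: &[usize]) -> Node {
    Node {
        uses: uses.iter().copied().collect(),
        defs: defs.iter().copied().collect(),
        succs: succs.to_vec(),
    }
}

/// Computes the live-in set of every node by fixed-point iteration.
pub fn liveness(nodes: &[Node]) -> Vec<BTreeSet<&'static str>> {
    let mut live_in: Vec<BTreeSet<&'static str>> = vec![BTreeSet::new(); nodes.len()];
    let mut changed = true;

    while changed {
        changed = false;
        // Visiting nodes in reverse order converges faster for backward analyses.
        for (i, n) in nodes.iter().enumerate().rev() {
            // live_out is the union of live_in over all successors.
            let mut live_out: BTreeSet<&'static str> = BTreeSet::new();
            for &s in &n.succs {
                live_out.extend(live_in[s].iter().copied());
            }

            // live_in = uses ∪ (live_out − defs)
            let mut new_in: BTreeSet<&'static str> =
                live_out.difference(&n.defs).copied().collect();
            new_in.extend(n.uses.iter().copied());

            if new_in != live_in[i] {
                live_in[i] = new_in;
                changed = true;
            }
        }
    }
    live_in
}
```

As a usage example, take the sequence li t0, 1; add a0, t0, s0; ret, modelling ret as reading a0 and ra. Building it with node(&[], &["t0"], &[1]), node(&["t0", "s0"], &["a0"], &[2]), and node(&["a0", "ra"], &[], &[]) gives a live-in set of {ra, s0} at the first instruction. A checker could plausibly build on sets like these: for instance, a temporary register that is live at function entry is being read before anything in the function wrote to it.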

Things I Learned

I <3 Rust so much. Like most Rustaceans, I feel like the language makes you smarter by enforcing its rules, and I’m happy for them.
Rust feels like a language built for building parsers.
The match/enum pattern should be used everywhere possible. It is the cleanest way to have the compiler enforce that every case is handled.
Exceptions are stupid. Result types should be the standard across languages.
Strict languages like Rust are great for large codebases where you have to ensure all programmers are developing at the same safe standards.
Time is a huge constraint to making good efficient software. I spent a huge chunk of time perfecting my dataflow algorithms, but I won’t have this luxury while working in industry.
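The match/enum and Result points above can be seen together in a tiny sketch. The Inst and LintError types and the check functions are hypothetical and are not taken from the project. The one hard fact used is that addi takes a 12-bit signed immediate, so its valid range is -2048 to 2047.

```rust
/// A few instruction shapes (hypothetical, heavily simplified).
#[derive(Debug, PartialEq)]
pub enum Inst {
    Addi { rd: &'static str, rs1: &'static str, imm: i32 },
    Jal { rd: &'static str, label: &'static str },
    Ret,
}

/// Errors are ordinary values, not exceptions.
#[derive(Debug, PartialEq)]
pub enum LintError {
    ImmOutOfRange(i32),
}

/// Checks one instruction. The match must cover every variant,
/// so adding a new variant to Inst later is a compile error until it is handled here.
pub fn check(inst: &Inst) -> Result<(), LintError> {
    match inst {
        // addi encodes a 12-bit signed immediate.
        Inst::Addi { imm, .. } if !(-2048..=2047).contains(imm) => {
            Err(LintError::ImmOutOfRange(*imm))
        }
        Inst::Addi { .. } | Inst::Jal { .. } | Inst::Ret => Ok(()),
    }
}

/// Checks a whole sequence; `?` returns the first error to the caller.
pub fn check_all(insts: &[Inst]) -> Result<usize, LintError> {
    for inst in insts {
        check(inst)?;
    }
    Ok(insts.len())
}
```

The design point is that the caller of check_all cannot forget the failure path: the return type says it can fail, and the compiler warns if the Result is ignored.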

Languages/Frameworks: (Pure) Rust cross-compiled to WASM, TypeScript/Node for the VSCode extension host.
Source: https://github.com/rajanmaghera/riscv-analysis (WIP, to be completed by August 2023)

Gazprea Compiler

[Fall 2022] A full compiler from (almost) the ground up.

A compiler for the Gazprea Language.

Motivation
After spending Summer 2022 working under J. Nelson Amaral, I was convinced to take CMPUT 415, compiler design. In this course, we were to (eventually) implement the front-end of a compiler targeting LLVM. The work was done in groups of 4. It was so much fun and hands down the most interesting course I’ve taken at the University of Alberta.

Gazprea is a language developed originally at IBM in Markham, ON for fast business operations. It has support for vector and matrix operations, in the mathematical definitions of those words, with a type system that incorporates the shape of the matrices. As well, functions are separate from procedures, where functions are pure functions only. I encourage you to check out the specifics of the language by reading the specification.

Implementation
The parser was generated using ANTLR4, an adaptive LL(*) parser generator. Our submission emitted LLVM IR, which (for testing) was interpreted with lli. Future generations of the course are planning to emit MLIR instead.

Our compiler did most things traditionally; we had our AST classes, methods that could walk our trees, passes, and symbol tables for types. We implemented our pass manager similar to how LLVM implemented its pass manager with support for “notating” a pass with a type of data. We, however, had no support for tracking pass invalidations and managed our pass selection via trial and error.

Our implementation passed about 90% of teaching team tests by the end of the course.

Due to this project being an assignment, I can’t disclose too many details about the project implementation.

Things I Learned

No matter how experienced a team is, large software engineering projects need a (proportionally) large amount of planning. Spending time early on to plan and distribute work will pay off well.
I had the honour to work with some very smart individuals on my team. However, I noticed that a common anti-pattern the team kept falling into was having the smart people do everything.
A working compiler is not hard; a perfect compiler is impossible. There were so many bugs and edge cases that we didn’t know how far to go down certain paths.
C++ is a good language if you enforce coding practices. clang-tidy is your friend.
UNIT TESTING WORKS. It’s not for ensuring a basic function works, it’s for ensuring later refactoring or additional code does not break your code. I use TDD everywhere now.

Languages/Tools: C++, ANTLR4, LLVM
Source: closed source :(

Social Athletics

[Fall 2019] Why hire a social media manager for sports teams?

A Google Apps Script based social media image/slideshow/tweet generator for local athletics scores and schedules.

Motivation
While I was in high school on the Men’s Rugby team, I was disappointed that the majority of the school had no idea that the sport even existed (even though we were Tier 1 in the city). Our school was small and consequently, the athletics culture was largely non-existent, let alone for a lesser-known sport.

I knew that large sports leagues (and larger schools) employed social media managers to generate assets like score cards, schedules, and more. However, unless a student was willing to volunteer to do this, it would not happen. Because of this, I came up with the idea of making an app that automated this.

This project was also used as my IB Internal Assessment for Computer Science SL in which I scored a 7/7.

Implementation
Due to ease-of-use constraints, and probably because I didn’t know any better, I made the whole thing in GApps Script, which is essentially JS with easy bindings to Google services. The app was based in a Google Sheet that acted like a DB and control panel (tip: you can bind image clicks in Sheets to functions).

The service would first scrape our local athletics website for scores and schedules, then cache the data in the spreadsheet. On subsequent scrapes of the website, any updated data would be marked for future creation. The spreadsheet also allowed for custom entries for anything that wasn’t on the website or sports that didn’t fit the traditional score vs. score reporting (think track or cheer).

About once a day, the script would create the posts that were earlier marked for creation. It would do this by transplanting the data into a template Google Slides slideshow. The backgrounds of slides could be customized with images for each specific team. Then, the image would be exported as a JPG and sent to an admin. In addition, the slides could be inserted into a different slideshow or tweeted as text.

The sheet contained a multitude of settings like updating the template colors/logo, scheduling “upcoming” posts, auto-removing old slides, where to tweet, and more.

Things I Learned
This was essentially my first software engineering project, even before I learned what that meant. It “ran” in the cloud, owing to Google Apps Script running my scripts every 5 minutes (thank you unlimited GApps Education accounts). Think serverless before I knew what serverless was. I would use the experience from this to fuel the rest of my CS career.

You can only finish a project if you have the motivation. I really liked marrying my world of CS, graphic design, and athletics. I think it was the main reason I didn’t need external motivators (like a class project).
Don’t over-optimize before the project is done. Without understanding how JavaScript really worked, I used arrays of arrays of arrays everywhere because I believed it was more “efficient”. Unless required otherwise, choose readability over efficiency first.
Objects exist in JS. Use them.
Server-side architecture is hard and requires a lot of thought.
The Google Sheets and Apps Script combo is not a bad way to prototype web apps that require a rudimentary database. As an added bonus, you get a premade UI to interact with it.
Use a library for web scraping, please.

Languages: Google Apps Script
Source: https://github.com/rajanmaghera/social-athletics

👋 Hi, I'm Rajan!

[pronounced raw-gin][he/him]

I'm interested in compilers, HPC, code optimization, educating, software engineering, and a tad bit of web stuff.