Blog

What are we missing? Part 2

It took quite a while to edit the second part, but I hope it is worth the wait.

Optional Semicolons

Once upon a time, BASIC didn’t need any instruction termination symbol. If you wanted to stick two or more instructions on the same line, you had to separate them with a colon (yes, this was before semicolons). Then it was Pascal and C, and the termination/separation character made its appearance (well, maybe history didn’t unfold exactly like this, but this is, more or less, how my relationship with the instruction termination evolved).

Scala, Python, and other languages do not need semicolons or make their use optional in most contexts. This isn’t a great saving, but it does make me wonder why we need semicolons in C++; isn’t the “missing semicolon” one of the most frequent syntax errors? And if the compiler can tell that a semicolon is missing, couldn’t the compiler put it there for me?

Well, I guess the problem is backward compatibility. The semicolon-free parser would give a different meaning to existing code. Consider, for example, expressions that are split over multiple lines. In C++, it is ok to evaluate an expression and throw the result away. So, introducing a new statement separation syntax would be a mess – code that used to work may now present subtle problems hard to spot in debugging and code reviews.
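To make the backward-compatibility concern concrete, here is a minimal sketch. The function names are made up for illustration, and the "newline ends the statement" rule is only a hypothetical parser behaviour, not a proposal. The first function is today's C++; the second spells out what the same two lines would mean if the line break ended the statement.

```cpp
#include <cassert>

// Today's C++: the statement runs until the semicolon, so the
// expression simply continues on the second line.
int sumAcrossLines(int a, int b) {
    int x = a
        + b;
    return x;
}

// What a hypothetical newline-terminated parser would see: "int x = a"
// ends at the line break, and "+b" becomes a separate expression
// statement (unary plus) whose result is thrown away.
int sumIfNewlineEndedStatement(int a, int b) {
    int x = a;
    +b;
    return x;
}

// Usage sketch: same source lines, two different results.
void semicolonDemo() {
    assert(sumAcrossLines(1, 2) == 3);
    assert(sumIfNewlineEndedStatement(1, 2) == 1);
}
```

Both versions compile (at most with a warning about the unused value), which is exactly why the change in meaning would be hard to spot.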

Nonetheless, coding without semicolons is somewhat liberating, and remembering to put that character at the end of lines is a custom that I need a while to get back to when switching from Scala to C++.

Garbage collection

C++ has a strange relationship with garbage collection. This may come as a surprise to many, but in the first C++ book, The C++ Programming Language, Stroustrup wrote that C++ could optionally support garbage collection. Microsoft, in the early years of .NET, introduced a C++ extension (managed C++, then C++/CLI) to handle managed pointers – a different class of pointers for garbage-collected objects.

C++ even had minimal support for GC, introduced in C++11 and meant to be leveraged by libraries such as the Boehm-Demers-Weiser collector. So, C++ is no stranger to garbage collection, but this automatic way of deallocating objects has never caught on. In C++23, the minimal GC support was abruptly removed.

The common way for modern C++ to manage memory is via automatic objects and smart pointers. Automatic objects are allocated on the stack, and they are automatically destroyed when the execution leaves the scope where they were allocated. Smart pointers are defined by the standard library: std::unique_ptr models a single owner, while std::shared_ptr is a reference-counting pointer, and both automatically dispose of the pointed-to object when its last owner goes away. By properly using std::unique_ptr and std::shared_ptr, memory management headaches are mostly gone.
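The three ownership styles can be sketched as follows. Tracked and aliveCount are hypothetical names introduced only to make destruction observable; they are not part of any library.

```cpp
#include <cassert>
#include <memory>
#include <utility>

int aliveCount = 0;  // illustrative counter of live Tracked objects

// Hypothetical type that records its own construction and destruction.
struct Tracked {
    Tracked() { ++aliveCount; }
    ~Tracked() { --aliveCount; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
};

// Walks through the three styles and returns the highest shared owner count seen.
int ownershipDemo() {
    {
        Tracked onStack;                            // automatic object
        assert(aliveCount == 1);
    }                                               // destroyed at end of scope
    assert(aliveCount == 0);

    {
        auto single = std::make_unique<Tracked>();  // exactly one owner
        auto moved = std::move(single);             // ownership is transferred, not copied
        assert(single == nullptr);
        assert(moved != nullptr && aliveCount == 1);
    }                                               // 'moved' deletes the object
    assert(aliveCount == 0);

    long ownersSeen = 0;
    {
        auto first = std::make_shared<Tracked>();   // reference count 1
        auto second = first;                        // reference count 2
        ownersSeen = second.use_count();
    }                                               // count drops to 0, object deleted
    assert(aliveCount == 0);
    return static_cast<int>(ownersSeen);
}
```

Calling ownershipDemo() returns 2, the owner count while both shared pointers were alive, and leaves no Tracked object behind.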

Many languages went the other way, having garbage-collected objects as the default way to handle memory, with an optional way to allocate and manually free a bunch of memory.

So, what are the advantages of garbage collection? Well, there are three main advantages:

  1. no reference counting management penalty (paid each time you copy/assign a shared pointer around);
  2. thread safety (starting from C++20, there is a std::atomic partial specialization for std::shared_ptr, std::atomic<std::shared_ptr<T>>, that can be used, but – of course – you pay extra time for each reference count update);
  3. GC works fine with reference loops – such as circular lists – while reference counting has troubles with these data structures.
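The reference-loop problem from the last point can be sketched like this. StrongNode, WeakNode and ringIsFreed are hypothetical names for this illustration. A two-node ring built with owning links keeps itself alive, while the same ring built with std::weak_ptr links is released as soon as the local owners go away.

```cpp
#include <cassert>
#include <memory>

struct StrongNode {
    std::shared_ptr<StrongNode> next;  // owning link: takes part in the count
};

struct WeakNode {
    std::weak_ptr<WeakNode> next;      // non-owning link: does not keep the target alive
};

// Builds a two-node ring, drops every local owner, and reports whether
// the nodes were actually destroyed.
template <typename Node>
bool ringIsFreed() {
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;                   // closes the loop
        observer = a;
    }
    const bool freed = observer.expired();
    if (auto leaked = observer.lock()) {
        leaked->next.reset();          // break the loop by hand so the demo itself does not leak
    }
    return freed;
}
```

ringIsFreed<StrongNode>() yields false and ringIsFreed<WeakNode>() yields true. A tracing collector needs no such distinction between owning and non-owning links.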

Garbage collection lets objects exist without a per-object reference count, and the time overhead is paid once in a while, when a periodic memory scan finds the objects that are no longer referenced and disposes of them.

There are two main problems with GC:

  1. Periodic execution of the collector may impact the performance of the application. GC has indeed made huge advances in this area; still, for real-time applications, it may be an issue to keep under control.
  2. Object disposal happens after the object’s last use, but you don’t control when. C++’s predictable destruction time allows C++ programmers to implement the RAII idiom.
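A minimal sketch of what predictable destruction buys: ScopeLogger is a hypothetical guard that records when it is acquired and released, standing in for a file handle, a lock, or any other resource.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical guard: logs on construction and on destruction.
class ScopeLogger {
public:
    ScopeLogger(std::vector<std::string>& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_.push_back("acquire " + name_);
    }
    ~ScopeLogger() { log_.push_back("release " + name_); }
    ScopeLogger(const ScopeLogger&) = delete;
    ScopeLogger& operator=(const ScopeLogger&) = delete;

private:
    std::vector<std::string>& log_;
    std::string name_;
};

// Returns the sequence of events produced by two nested scopes.
std::vector<std::string> raiiDemo() {
    std::vector<std::string> log;
    {
        ScopeLogger outer(log, "outer");
        {
            ScopeLogger inner(log, "inner");
            log.push_back("work");
        }   // inner is released exactly here
        log.push_back("more work");
    }       // outer is released exactly here
    return log;
}
```

The release events land at the closing braces, in reverse order of acquisition. With a garbage collector, the equivalent cleanup would run at some unspecified later time.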

So there are pros and cons. What I like about GC is that you don’t have to care about dynamic memory – in C++ I have to think whether the object is referenced only here (unique_ptr) or may be accessed by several parts of the code (shared_ptr), and then maybe I have naked pointers around I should take care of, and maybe I have to transform one smart pointer into another. As you can see, it is not as straightforward as allocating the object and letting the GC do the work.

Lazy Values

This one is a bit unusual for the C++ programmer, but it definitely makes sense. Consider a variable with an expensive initialization:

class Foo
{
  val bar = f()
}

In this code, the call to f() happens each time an instance of Foo is created. Now, suppose that according to the execution context, the bar variable is never used. That’s a pity; the code is unnecessarily performing computationally heavy tasks.

The lazy attribute can be used like this:

class Foo
{
  lazy val bar = f()
}

This means that the function f() is called the first time the variable bar is referenced. Should we want to rewrite this in C++, it would be something like:

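#include <optional>

// T and f() stand for the value type and the expensive initializer of the Scala snippet.
class Foo {
public:
  // Computes bar on first access and caches it (not thread safe as written).
  T getBar() const {
    if( !bar ) {
      bar = f();
    }
    return *bar;
  }
private:
  mutable std::optional<T> bar = std::nullopt;
};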

Ugly and not very readable; the mutable specifier is really a flashing warning sign that something fishy is going on.

The lazy tool is also useful for creating infinite data structures or processing a subset of a large amount of data without the need to compute or retrieve all the data of the superset.

Of course, there’s more to it to make this work properly in a multithreaded environment, with shared resources and an initialization order defined by access. The only “undefined behaviour” is with recursive initialization (i.e., to initialize a, you need b. But to initialize b, you need a).
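As a rough idea of what the multithreaded part involves, here is a minimal C++ sketch of a reusable lazy value. Lazy is a hypothetical helper, not a standard-library type, and it uses a plain mutex for simplicity. Like the Scala construct, it does not cope with recursive initialization: an initializer that reads the same Lazy again would try to lock the mutex twice.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical helper: stores an initializer and runs it on first access.
template <typename T>
class Lazy {
public:
    explicit Lazy(std::function<T()> init) : init_(std::move(init)) {}

    // The mutex makes concurrent first accesses wait instead of racing;
    // once set, the value never changes, so returning a reference is safe.
    const T& get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!value_) {
            value_ = init_();
        }
        return *value_;
    }

private:
    std::function<T()> init_;
    mutable std::mutex mutex_;
    mutable std::optional<T> value_;
};

// Usage sketch: the initializer runs once, however many times get() is called.
int lazyDemoCalls() {
    int calls = 0;
    Lazy<int> bar([&calls] { ++calls; return 42; });
    assert(calls == 0);              // nothing computed yet
    const int first = bar.get();
    const int second = bar.get();
    assert(first == 42 && second == 42);
    return calls;
}
```

The mutable members are still there, only hidden in one place instead of being repeated in every class that needs a lazy field.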

Object

The C++ language has no native notion of Singleton, so singletons are typically implemented as:

class TheTypeIWantJustOneInstance {
  public:
    static TheTypeIWantJustOneInstance& get() {
      static TheTypeIWantJustOneInstance instance;
      return instance;
    }
    ...
};

Before C++11 this was not thread safe: if the method get() was concurrently called by two threads, you could get instance initialized twice (at the same address… not good). Since C++11, the language guarantees that a function-local static is initialized exactly once, even when get() is called concurrently. But even with the thread-safety problem out of the way, the reader still has to decode a pattern of code to identify this as a singleton.
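A small sketch to check the once-only initialization of a function-local static, assuming a C++11 or later compiler and a platform with thread support. Registry and the constructions counter are hypothetical names used only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> constructions{0};   // illustrative counter

class Registry {                     // hypothetical singleton type
public:
    static Registry& get() {
        static Registry instance;    // initialized once, even under concurrent calls
        return instance;
    }
    Registry(const Registry&) = delete;
    Registry& operator=(const Registry&) = delete;

private:
    Registry() { ++constructions; }
};

// Calls get() from several threads and reports how many times the
// constructor has run in total.
int constructionsAfterConcurrentGets(int threadCount) {
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back([] { Registry::get(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return constructions.load();
}
```

However many threads are started, the function reports a single construction, and every call to Registry::get() returns the same object.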

Scala offers the singleton construct natively. It is called “object”, and it looks like this:

object InstanceOfTheTypeIWantJustOneInstance {
  ...
}

The object construct offers a different perspective on class data. In C++, you can define a member variable or a member function to be static so that it is shared among all the instances of a class. In Scala, there is no such concept, but you can use the companion object idiom.

A companion object is an object that has the same name as an existing class and is defined in the same source file. The members of the companion object are not automatically in scope inside the class – methods of the class still need to import the symbols (or qualify them with the object name) to access them. But from the user’s point of view, you can use the Class.member notation to access a member of the companion object. This gives quite a precise feeling of accessing something that is related to the class and not to the instance.

This example is from my solutions to the Advent of Code:

object Range {
  final val Universe = Range( 1, 4000 )
}


case class Range( start: Int, count: Int ) {
  def end = start+count
  def lastValue = end-1

  //...
  def complement : List[Range] =
    import Range.Universe
    assert( start >= Universe.start )
    assert( end < Universe.end )
    val firstStart = Universe.start
    val firstCount = start-Universe.start
    val secondStart = end
    val secondCount = Universe.end-end
    List( Range( firstStart, firstCount), Range(secondStart, secondCount ))
      .filter( _.isNonEmpty )

}

In this example, the class Range defines a numerical range (first value, count). The companion object contains a constant (Universe). The complement operation needs to access the Universe to compute the complement of a range. As you can see, to use the Universe symbol, the Range class needs to import it.
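For comparison, a rough C++ counterpart keeps the constant as a static data member of the class itself. This is only a sketch: isNonEmpty is elided in the Scala snippet, so its definition here (count greater than zero) is an assumption, and std::vector stands in for List.

```cpp
#include <cassert>
#include <vector>

struct Range {
    int start;
    int count;

    int end() const { return start + count; }
    bool isNonEmpty() const { return count > 0; }   // assumed definition

    // Static member: shared by all instances, reachable as Range::Universe.
    static const Range Universe;

    // Returns the parts of Universe that lie before and after this range.
    std::vector<Range> complement() const {
        assert(start >= Universe.start);
        assert(end() < Universe.end());
        const Range first{Universe.start, start - Universe.start};
        const Range second{end(), Universe.end() - end()};
        std::vector<Range> result;
        for (const Range& part : {first, second}) {
            if (part.isNonEmpty()) {
                result.push_back(part);
            }
        }
        return result;
    }
};

// Definition of the static member, outside the class.
const Range Range::Universe{1, 4000};
```

Inside the member function, Universe is visible without any import. The price is that static and per-instance members are mixed in the same class body.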

Another interesting application is to use the companion object to provide additional constructors for the class. Using the apply method (that works like C++ operator()), you can create a factory:

import scala.reflect.ClassTag

object SimpleGrid
{
  def apply[A: ClassTag]( width: Int, height: Int, emptyValue: A ) : SimpleGrid[A] =
    val theGrid: Array[Array[A]] = Array.ofDim[A](height, width)
    theGrid.indices.foreach(
      y => theGrid(y).indices.foreach(
        x => theGrid(y)(x) = emptyValue
      )
    )
    new SimpleGrid(theGrid)

  def apply[A: ClassTag]( data: List[String], convert: Char => A ) : SimpleGrid[A] =
    val theGrid: Array[Array[A]] = Array.ofDim[A](data.length, data.head.length)
    data.indices.foreach(
      y => data(y).indices.foreach(
        x => theGrid(y)(x) = convert(data(y)(x))
      )
    )
    new SimpleGrid(theGrid)
}

Here, the companion object for the SimpleGrid class provides two alternate constructors. The first accepts grid width and height, and the default content for a cell. The second constructor accepts a list of strings (one string per row, one character per cell) and a function to convert each character into the cell’s initial value.
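In C++ the closest idiom is probably a pair of static factory functions (named constructors). The sketch below is a hypothetical counterpart of SimpleGrid, reduced to what the example needs. The names filled, fromLines, at and height are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical C++ counterpart of SimpleGrid.
template <typename A>
class SimpleGrid {
public:
    // Named constructor: width x height grid filled with emptyValue.
    static SimpleGrid filled(std::size_t width, std::size_t height, const A& emptyValue) {
        return SimpleGrid(
            std::vector<std::vector<A>>(height, std::vector<A>(width, emptyValue)));
    }

    // Named constructor: one row per string, one cell per character.
    static SimpleGrid fromLines(const std::vector<std::string>& data,
                                const std::function<A(char)>& convert) {
        std::vector<std::vector<A>> cells;
        for (const std::string& line : data) {
            std::vector<A> row;
            for (char c : line) {
                row.push_back(convert(c));
            }
            cells.push_back(std::move(row));
        }
        return SimpleGrid(std::move(cells));
    }

    const A& at(std::size_t x, std::size_t y) const { return cells_[y][x]; }
    std::size_t height() const { return cells_.size(); }

private:
    explicit SimpleGrid(std::vector<std::vector<A>> cells) : cells_(std::move(cells)) {}
    std::vector<std::vector<A>> cells_;
};
```

A call then reads SimpleGrid<int>::filled(3, 2, 0) or SimpleGrid<int>::fromLines(lines, toDigit), where lines and toDigit are whatever the caller provides. Unlike Scala's apply, the factory name has to be spelled out at the call site.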

I find this approach interesting because it provides a native singleton concept and, at the same time, simplifies the class construct, removing the burden of class methods and fields.

Conclusions

In this post, we have explored several key concepts and constructs that distinguish C++ and Scala. Some are just syntactic sugar, like lazy vals and objects. You can argue that you can define your own CRTP-based helpers to implement them in a C++ library, but having them in the language sets the standard way of using these constructs; it defines the shared vocabulary, if you will.

Other concepts are more drastically different – memory management (alongside the principle that everything structured is accessed by reference) being the most evident. I am not a big fan of GC, having delved more than once into optimizing memory usage to keep garbage collection from spoiling the game (literally, a game). But aside from relieving the programmer of low-level memory management chores, garbage collection allows for better handling of objects.

In the next installment, we’ll go into the more advanced functional direction.

What are we missing? Part 1

I remember when I was (really) young, the excitement of the discovery when learning a programming language. It was BASIC first. That amazing feeling of being able to instruct a machine to execute your instructions and produce a visible result! Then came the mysterious Z80 assembly, with the incredible power of speed at the cost of long, tedious hours of hand-writing machine codes and the void sensation when the program just crashed with nothing but your brain to debug it.

A few years later, I was introduced to C. A shiny new world, where the promise of speed paired up with the ease of use (well, compared to hand-written assembly, it was easy indeed). And later on, C++. Up to this point, it seemed like a positive progression; each step had definitive advantages over the previous one, no regret or hesitation in jumping onto the new cart.


Professional Programmer

And the next discussion topic was “Are you professional programmers? And what does it mean?”

What promised to be a C++ meet-up about topics that could spark a flame war turned into a thought-provoking moment.

It started lightly with the east-const vs west-const question (obviously east-const is the right answer), then things got much more foundational.


Elf and Guards – Days 5 and 6

It couldn’t last forever—spending consecutive days writing about my progress in the Advent of Code took too much time and the rest of my life knocked on the door. In this post, I will try to update you quickly on days five and six.

On day 5 we had to assist an unfortunate handbook printer to get the updated pages in the correct order. Being magic-elven stuff, the proper order is not the natural increasing order of integers, but a custom order defined by number pairs – e.g. 43|12 means that page 43 has to be followed immediately by page 12. But things are not that simple: the order relationship is not linear like abc, but may branch, so, to give you an idea, after a may come either b or d, and both are then followed by c.


Finding XMAS – Day 4

Day 4 of the Advent of Code presents you with two new puzzles based on the word search puzzle idea. For some reason, you are teleported to the Ceres Elven Station (that made an appearance in AoC 2019), but there’s nothing for you to see here (yet) – not even the chief historian we are looking for. So a small elf asks for your help to solve her word search.

Being Xmas elves, the word you have to look for is “XMAS”; it can be written forwards or backwards, and in any direction: left, up-left, up, up-right, right, down-right, down, and down-left.


Elves’ Programming Language – Day 3

There are plenty of programming languages and, of course, Xmas elves have their own. Although your main goal is still to find the Chief Historian, we are now in a warehouse, historian minions are wandering around looking for their boss and we are tasked to fix the computer.

This puzzle seemed a bit easier than the first two days, but I found it somewhat underspecified. You have to scan a text for patterns like “mul(n,m)” with n and m integers. For each pattern multiply n by m and sum all the products together.


Elves’ Advent Quest – Day 2

After warming up with the first day’s puzzle, here we are with the second day. The bar is still low, but not as low as yesterday. With the coordinates decoded in the last puzzle, everyone runs to the nearest location which happens to be a nuclear plant (for reindeer needs). While we are here the elves running the plant ask for your help to analyze some data looking for anomalies.

Yes, we are in a hurry to look for the chief historian elf, but we also have 25 days to fill with puzzles. So here we go.


Helping Elves One Star at a Time

That period of the year has finally come. That cozy winter feeling of cold outside the window, while busy waiting for the holidays. As a long-standing tradition that started last year, this is the time I help Elves prepare for Xmas in the Advent of Code.

Each day of Advent, the Advent of Code proposes a couple of puzzles that can be solved by writing some code. There are no constraints on the language or the approach since the result of the puzzle is an integer number. Also, there is no deadline; you can continue working on the puzzles after Xmas. And there is no prize; you just know you did the right thing helping the elves and Santa.


No Future for Us to C

It has been a while since some official documents produced by the USA administration advised against using unsafe programming languages like C and C++ (yes, C and C++ are explicitly mentioned). Now, the news resurfaced on the web with an added deadline—manufacturers have until January 1, 2026, to create a memory-safe roadmap for existing applications.

Let’s have a look at the text:

Recommended action: Software manufacturers should build products in a manner that systematically prevents the introduction of memory safety vulnerabilities, such as by using a memory safe language or hardware capabilities that prevent memory safety vulnerabilities. Additionally, software manufacturers should publish a memory safety roadmap by January 1, 2026.

This is pretty strict, even if it is phrased with “should” and not “shall” or “must”. If you develop a new product, you’d better drop C or C++. Moreover, if you have an existing product written in one of these pesky languages, you should provide a roadmap to memory safety.


Lambda World 2024

It was 2019 and nothing suggested that 2020 would not be another regular year – another spring, another early bird ticket for the Lambda World conference, another regular summer, and then the most awaited conference of the year, a journey to Cadiz to attend Lambda World 2020.

Things went quite differently (for those reading from other planets – because of the pandemic) and the world was suddenly locked down, some conferences went online, and some were canceled, Lambda World included. Things went a bit better in 2021, but there was no news about a new edition of the functional programming conference. It was 2022, and many conferences had returned to the in-person format, maybe with some precaution, but the worst part of the pandemic was finally over. Again no trace of Lambda World. 2023 came and went, and still no sign of lambda-life from Cadiz. Still vague or no answer to my emails to the organizing company.

And then suddenly and unexpectedly, the email popped up in my inbox – early bird ticket sale for Lambda World 2024 is open. I was so happy they were back, but at the same time I was about to change jobs and it was not the best approach to arrive on the first day at the new workplace and expect to be sponsored to attend the conference. So, after a brief family check, I decided to self-sponsor my attendance and hope for at least some partial refund.
