Test Data in TypeScript: Challenges, Solutions, and My Experience

Testing is an integral part of development, especially when it comes to complex applications. It allows us to ensure code quality and prevent unexpected errors. However, an important question every developer faces sooner or later is: how do you balance between thorough test coverage and the time spent writing them? Even more importantly, how do you avoid getting bogged down by the routine?

Working with test data is a challenge that can exhaust even the most patient developers. Simple tests with primitive parameters are quick and easy to write. But when your method deals with complex objects with nested structures, the situation becomes significantly more complicated. With each new test, more time is spent preparing the data, reducing readability, and increasing overall workload.

This story is not about promoting libraries or tools. It’s my personal experience with tackling typical issues related to data generation for TypeScript tests. I want to share the difficulties I’ve faced, how common approaches from other languages didn’t work, the solutions I eventually arrived at, and—hopefully—get some new perspectives on the problem from the community.

The Test Data Problem in Unit Testing

Writing a good test isn’t too difficult: if the production code adheres to SOLID principles and isn’t overloaded with logic, the test should also end up simple and concise. But here’s where everything hits a wall: models.

As long as we’re dealing with concepts like “method accepts a string”, “method returns an int”, everything works smoothly. However, when we move on to working with objects, the issue of supplying data to each required model in every test arises.

At first glance, generating this data might seem trivial: we simply create objects with the required values. But in practice:

  1. Complexity of entities. The more complex the data types in the code (e.g., nested structures or collections with various objects), the harder it becomes to manually generate a valid test dataset.
  2. Quantity. There are a lot of tests, so preparing each dataset consumes a massive amount of time.
  3. Maintenance. During refactoring or changes to type structures, updating the test data manually becomes an additional burdensome task.

Approaches to Generating Test Data

Below, I’ll describe a typical progression for a developer writing tests—perhaps you’ll recognize something from your own project.

Carrying Everything With You

The first approach is based on the most primitive principle: if I can define the value of an int variable directly in the test, why not populate an entire object in the same way?

For simple objects like a Role class with fields id and name, this works, of course. But where there’s a role, there’s also a user that this role belongs to. Then there’s UserInRole, which links these two entities... If there’s a method in the production code like "retrieve all role names for a user with login X", we’ll need to define not one but three models.

The test grows as the model gets more complex. It becomes extremely inconvenient to read—how can you understand which of these details are truly important for the test (e.g., the user's login) and which are not?
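To make the problem concrete, here is a rough sketch of what such a test tends to look like (Jest syntax; the Role, User, and UserInRole shapes and the getRoleNamesFor function are hypothetical stand-ins). Only the login actually matters to the test, yet everything has to be spelled out:

```typescript
// Hypothetical shapes, invented for illustration only.
interface Role { id: number; name: string; }
interface User { id: number; login: string; }
interface UserInRole { userId: number; roleId: number; }

// Stub of the production method under test, included so the sketch is self-contained.
function getRoleNamesFor(login: string, users: User[], roles: Role[], links: UserInRole[]): string[] {
  const user = users.find(u => u.login === login);
  const roleIds = links.filter(l => l.userId === user?.id).map(l => l.roleId);
  return roles.filter(r => roleIds.includes(r.id)).map(r => r.name);
}

it('returns all role names for a user with a given login', () => {
  // Three models' worth of hand-written data, most of it irrelevant to the assertion.
  const roles: Role[] = [
    { id: 1, name: 'admin' },
    { id: 2, name: 'editor' },
  ];
  const users: User[] = [{ id: 10, login: 'vasya' }];
  const userInRoles: UserInRole[] = [
    { userId: 10, roleId: 1 },
    { userId: 10, roleId: 2 },
  ];
  //
  const result = getRoleNamesFor('vasya', users, roles, userInRoles);
  //
  expect(result).toEqual(['admin', 'editor']);
});
```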

With this approach, we:

  • Ruin readability.
  • Forget about brevity.
  • Spend a lot of time writing each test.

Special Methods in Test Classes

At some point, someone notices that the tests for class X contain a lot of duplicated setup code. That code gets extracted into a separate helper method right there in the test class.

Tests become short again. We've stopped violating the DRY principle, although, strictly speaking, no such hard rule was ever imposed on test code in the first place! So, what did this achieve? Can we say we've improved the conciseness of the tests? To be fair, yes, the code is now cleaner. That's a small victory!
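A minimal sketch of this stage, assuming Jest; the User shape and both functions are invented for illustration:

```typescript
// The "helper method in the test file" stage.
interface User { id: number; login: string; roles: string[]; }

// The duplicated setup, now extracted into a local helper:
function createUser(login = 'vasya'): User {
  return { id: 10, login, roles: ['admin', 'editor'] };
}

// Stub of the code under test, so the sketch is self-contained:
function getRoleNames(user: User): string[] {
  return user.roles;
}

it('returns all role names for a user', () => {
  const user = createUser('petya'); // the actual values now live in the helper
  //
  const result = getRoleNames(user);
  //
  expect(result).toEqual(['admin', 'editor']); // why these two? you have to go look
});
```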

But what about the downsides?

  • Readability has not only failed to improve—it may have actually become worse. Now, to understand what’s happening in a test, you have to jump back and forth between the test and the factory method.
  • Situations often arise where the same object needs to be used across different classes. If we create a fake in class A, what should we do in class B? Duplicate it? In a small project, this isn’t immediately noticeable, but once the project grows in scale, you find yourself drowning in problems.

Handwritten "Databases"

Eventually, a “genius” idea may arise: why not hard-code a fake database in static collections, predefine all the needed values and relationships, and use this data in our tests?
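A sketch of what such a hand-written "database" usually ends up looking like (the shapes and values here are purely illustrative):

```typescript
// A hard-coded "fake database": everything predefined in static collections.
interface Role { id: number; name: string; }
interface User { id: number; login: string; }
interface UserInRole { userId: number; roleId: number; }

export class FakeDb {
  static readonly roles: Role[] = [
    { id: 1, name: 'admin' },
    { id: 2, name: 'editor' },
  ];
  static readonly users: User[] = [
    { id: 10, login: 'vasya' },
    { id: 11, login: 'petya' },
  ];
  static readonly userInRoles: UserInRole[] = [
    { userId: 10, roleId: 1 },
    { userId: 11, roleId: 2 },
    // ...plus every other relationship the whole test suite will ever need
  ];
}
```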

There’s not much sense in dwelling on this approach, because it’s only suitable for small projects with three classes and (ideally) no relationships between them.

This resolves nothing, and the supposed "central management" of fake data is completely overshadowed by major downsides:

  1. As the project scales, it becomes nearly impossible to account for and define all the relationships within this system.
  2. The initializer for this fake context quickly grows into a ridiculously long monster.
  3. Developers using this approach tend to rely too much on the predefined data in these collections. Since the required value is already defined, they simply remember it and use it when writing tests. That’s it—you can kiss your code readability goodbye.

On top of that, these collections become untouchable—don’t dare to alter anything, because half the tests will immediately fail!

Factories

Eventually, you get tired of writing the same thing in every test, or micromanaging the handwritten “database,” and you decide to switch to object factories. For instance, if there’s a Role class, the testing project gets a RoleFakeFactory with methods like create or createMany. Yes, you'll need to address some implementation decisions, such as whether these factories should be static, how they obtain their data context, and other similar questions.

However, these aspects don’t significantly affect the usability of this approach. Again, with a small amount of code, this feels like the right solution, and it really seems to work. For the first couple of weeks, everyone is probably satisfied with the results.

So, what did we achieve?

We finally solved the duplication problem, as everything is now in its assigned place. That’s... basically it.

Downsides?

  • The code is still difficult to read. Yes, management of fakes is now centralized, but how do we interpret the values of these fields?
```typescript
// User is the application's model; the factory hard-codes "some" values for it.
export class UserFactory {
  public static create(): User {
    return {
      id: 345,
      name: 'Vasya',
      email: 'mail@test.test',
      phoneNumber: '123456789',
      emailConfirmed: true,
      phoneNumberConfirmed: true,
    };
  }
}
```

Is emailConfirmed === true there for a specific test? For most tests? Is it a default value? It’s completely unclear where all these values came from and which of them are meaningful and which are simply irrelevant.

  • Readability remains at the same level as the previous approach.
  • The "untouchable data" problem hasn’t gone away either; we’ve merely repackaged the magic values without really changing the core approach.

Automatic Object Generation

In many programming languages, ready-made tools exist for generating random objects for unit tests. While the specifics of how this generation is implemented may vary, it’s important (in the context of our discussion) to consider the consequences of such automation.

  • Objects are generated at runtime with random values, eliminating the need to guess the purpose of hardcoded variables—anything truly necessary for the test can be explicitly specified.
  • Class instances are created in a single line. Goodbye to factories, special methods, and extra utilities—we gain the ability to generate usable objects anywhere and with any configuration.

What did we achieve with automatic generation?

  • Code readability has significantly improved—we can now truly be proud of it. All necessary data for a test is located within the test itself.
  • Conciseness is also top-notch—everything written is focused on the specific needs of the test, with no unnecessary data cluttering up the code.
  • The speed of both writing and reading tests improves drastically; depending on the previous approach, the gains can be several-fold.
  • Efficiency is directly tied to simplicity and convenience: the easier testing becomes, the more engaged developers are, and the more eager developers are to write tests, the higher the overall quality of the production code.

I often say that most things have already been invented for us, and programmers shouldn’t reinvent the wheel at every step. However, in the case of TypeScript, this concept falters...

Inspiration from C#: AutoFixture

AutoFixture has become a true ally in my C# testing endeavors. This library enables the automatic generation of objects based on their type, minimizing manual effort. AutoFixture works by leveraging reflection—access to type metadata at runtime—which makes it incredibly powerful and convenient.

After trying AutoFixture, I quickly grew fond of this approach: less repetitive code, more focus on test logic. It worked so seamlessly that when transitioning to TypeScript and Angular, I found myself missing such functionality.

TypeScript and the Test Data Problem

TypeScript is a powerful language that has become indispensable for developing complex frontend applications. However, it has an important limitation: all types are erased at compile time. Once the code is transpiled into JavaScript, all type information disappears.
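A quick illustration of the limitation, unrelated to any particular library:

```typescript
// The interface exists only at compile time; after transpilation nothing is left of it.
interface User {
  id: number;
  name: string;
  email: string;
}

function create<T>(): T {
  // T is erased at runtime: there is no equivalent of C#'s typeof(T).GetProperties()
  // here, so the function cannot discover which fields User has.
  return {} as T; // the best we can do without extra metadata
}

const user = create<User>(); // compiles, but yields an empty object
```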

Thus, what can be easily implemented using reflection in C# becomes unattainable within standard TypeScript code. I spent a long time searching for AutoFixture-like alternatives for TypeScript, but all my efforts were in vain. Either the libraries lacked sufficient functionality, or they required explicit type specifications when generating data—failing to resolve the core problem.

TypeScript Transformers as a Solution

During my search, I came across TypeScript Transformers. This is an incredibly powerful yet underutilized tool.

TypeScript Transformers are plugins that intervene in the process of compiling TypeScript to JavaScript. They allow you to add or modify code "on the fly." Thanks to transformers, it becomes possible to do what regular TypeScript code cannot—pass metadata about types into the final JavaScript.
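For a sense of the mechanism, a bare-bones custom transformer looks roughly like this; a real one for test-data generation would consult the type checker and rewrite marked call sites, which this sketch only hints at in comments:

```typescript
import * as ts from 'typescript';

// A minimal transformer factory: it walks the AST of every source file and
// returns it unchanged. This is the general shape, not any library's actual code.
export function exampleTransformer(program: ts.Program): ts.TransformerFactory<ts.SourceFile> {
  return (context) => (sourceFile) => {
    const visit = (node: ts.Node): ts.Node => {
      // A real transformer would consult program.getTypeChecker() here, e.g. to find
      // calls like create<SomeType>() and replace them with generated object literals.
      return ts.visitEachChild(node, visit, context);
    };
    return ts.visitEachChild(sourceFile, visit, context);
  };
}
```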

Inspired by the idea, I set off to develop my own library for automatic object generation for tests.

At first, it all looked very promising, but... as with many powerful tools, there was a catch. The main difficulty is using TypeScript Transformers. It’s an incredibly useful technology, but it requires some "hoop-jumping." Since TypeScript Transformers interfere with the compilation process, they need to be manually integrated into a project.

As a result, simply installing the library via npm install was out of the question.

To work properly with transformations, you need to:

  1. Use a build tool that supports TypeScript Transformers (for example, Webpack with ts-loader, or plain tsc patched with ts-patch).
  2. Register the transformer in the compiler configuration—not just declare it but add a new phase to the project build process.
  3. Occasionally update the settings when TypeScript gets updated. Unfortunately, updates are not always seamless.

Although these steps may not sound overly complex, in practice not every developer is willing to dive into build configuration just to use a single library. Over time I managed to streamline everything into a few simple steps, but "interesting" approaches almost always bring extra bugs and difficulties for users. Still, this is where I am today: it's the best solution I've come up with so far. The dependency on tooling remains an issue, since not all build tools handle transformations well. For example, Webpack with ts-loader and tsc work fine, but lighter tools like Vite or esbuild may take considerable extra effort to configure.
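For illustration, registering a transformer through Webpack and ts-loader looks roughly like this; the transformer factory and its package are placeholder names, not real identifiers:

```typescript
// webpack.config.ts: a sketch of wiring a custom transformer into the build.
import type * as ts from 'typescript';
import { myTransformerFactory } from 'my-transformer-package'; // hypothetical import

export default {
  module: {
    rules: [
      {
        test: /\.ts$/,
        loader: 'ts-loader',
        options: {
          // ts-loader passes the ts.Program and expects custom transformers in return.
          getCustomTransformers: (program: ts.Program) => ({
            before: [myTransformerFactory(program)],
          }),
        },
      },
    ],
  },
};
```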

In simpler terms, I couldn’t make my library "just work out of the box," which doesn’t personally bother me. In my standard Angular projects, everything works fine, as it does in rare "pure" TypeScript setups with Jest. But beyond that—there’s a fog of war I haven’t had time to penetrate.

Is It Worth It?

Given all the downsides of working with transformers, you might ask: is this even worth it? After all, data generation can still be set up manually, or you could rely on a simpler API.

For me, the answer is clear: all this "hoop-jumping" is absolutely worth the effort. I believe the time spent digging through the TS documentation and writing and testing the library was justified. As for usage, I now include it in all my projects by default.

Moreover, the transformer technology itself is still evolving. Over time, the TypeScript build ecosystem may make the connection process simpler than it is now.

Conclusion

In the end, I arrived at a form of testing that closely resembles the logic I used to write in C#.



```typescript
it('number of lines is correct', () => {
  const items = Forger.create<ListItem[]>()!;
  //
  const result = service.convert(items);
  //
  should().array(result).length(items.length);
});
```




`ListItem` here is a complex object with numerous fields, but since its contents are irrelevant to this test, it’s enough to populate it with arbitrary values; the generated data simply lets the narrow piece of logic under test be exercised.

The library is actively used in my projects, and I hope it becomes a convenient tool for other developers looking to make their tests not only high-quality but also easier to write. Perhaps someone will even suggest new approaches and methods. You can explore its capabilities [here](https://forger.artstesh.ru/76a7eb56-cc26-481c-9302-cf8ddbd2002a), and feel free to share suggestions in the comments or via direct messages—I’d be happy to hear constructive ideas.