Generating Meaningful Test Data Using Faker
Shwetabh Shekhar

Publish Date: Jan 17 '21

Whether you are building an API or writing tests for features that process massive datasets, meaningful test data is always a necessity. How do we fill this need? Faker is the answer.

What is Faker?

Faker is a library that generates large amounts of realistic fake data in Node.js and the browser. It is also available in a variety of other languages such as Python, Perl, Ruby, and C#. This article, however, will focus entirely on the Node.js flavor of Faker.

You can see a live demonstration of faker here.

Generating Data using Faker

Let's consider a use case where we want to store personal information in a CSV file with the following fields:

  • First Name
  • Last Name
  • Address (Street, City, State, Zip Code, Country)
  • Phone Number
  • Email

And we need 100,000 such records, all of them meaningful. Stop for a moment and think about how you would generate them. This is where Faker comes into play.

Generating CSV Datasets

Initialize your Node.js project and install Faker, along with lodash, which the script below also uses:

npm i faker lodash

Include the dependencies in your project.

const faker = require('faker');
const fs = require('fs');
const _ = require('lodash');

Define your headers for CSV based on the schema:

// Define the headers of your CSV file.
// Each key maps to a function rather than to a value, so that every
// call in the loop below makes Faker generate fresh data for that column.
const csvHeaders = {
    FIRST_NAME: () => faker.name.firstName(),
    LAST_NAME: () => faker.name.lastName(),
    STREET_ADDRESS: () => faker.address.streetAddress(),
    CITY: () => faker.address.city(),
    STATE: () => faker.address.state(),
    ZIP_CODE: () => faker.address.zipCode(),
    COUNTRY: () => faker.address.country(),
    VOICE_NUMBER: () => faker.phone.phoneNumber(),
    EMAIL_ADDRESS: () => faker.internet.email(),
};

I am using a write stream because the rows are written to the output file sequentially, one after another.

// open the write stream
const stream = fs.createWriteStream("huge-csv.csv");
// write the header line
stream.write(Object.keys(csvHeaders).toString() + "\n");

Create the 100,000-record CSV file.

// write the body
let csvBody = [];
for (let i = 0; i < 100000; i++) {
    _.forEach(csvHeaders, function (generate, key) {
        // call the generator to get a new value for this row
        csvBody.push(generate());
    });
    stream.write(csvBody.toString() + "\n");
    csvBody = [];
}
// close the stream
stream.end();
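The loop above joins values with a bare comma and writes without checking whether the stream's buffer is full. A more defensive variant might look like the sketch below. It is an illustration under assumptions, not the code from this article: escapeCsvField, writeCsv, and the stand-in generator map are hypothetical names, and the generators are plain functions so the snippet runs without Faker installed. In a real script each one would wrap a Faker call. The quoting follows the usual CSV convention (wrap a field in double quotes when it contains a comma, quote, or line break, and double any embedded quotes), and the loop waits for the stream's drain event whenever write() returns false.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { once } = require('events');

// Hypothetical helper: quote a field only when it needs quoting.
function escapeCsvField(value) {
    const text = String(value);
    if (/[",\r\n]/.test(text)) {
        return '"' + text.replace(/"/g, '""') + '"';
    }
    return text;
}

// Hypothetical helper: write `count` rows, calling every generator once per row.
async function writeCsv(filePath, generators, count) {
    const stream = fs.createWriteStream(filePath);
    const keys = Object.keys(generators);
    stream.write(keys.join(',') + '\n');
    for (let i = 0; i < count; i++) {
        const row = keys.map((key) => escapeCsvField(generators[key]()));
        // write() returns false once the internal buffer is full;
        // pause until 'drain' so memory use stays bounded.
        if (!stream.write(row.join(',') + '\n')) {
            await once(stream, 'drain');
        }
    }
    stream.end();
    await once(stream, 'finish');
}

// Stand-in generators (assumption): replace each with a Faker call,
// for example () => faker.address.country().
let counter = 0;
const generators = {
    ID: () => ++counter,
    COUNTRY: () => 'Korea, Republic of',
    NOTE: () => 'said "hi"',
};

const outputPath = path.join(os.tmpdir(), 'faker-sketch.csv');
const done = writeCsv(outputPath, generators, 1000);
done.then(() => console.log('wrote', outputPath));
```

Whether the extra quoting matters depends on the data: a generated value that contains a comma would otherwise shift every later column in its row.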

Generating JSON Datasets

The process of generating a JSON file is largely the same, with minor tweaks. I will leave that as an exercise. The code is available in my GitHub repository.
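As a rough illustration of those tweaks (and not the code from the repository mentioned above), the sketch below builds each record as an object and writes the records as one JSON array. buildRecord, writeJson, and the stand-in generators are hypothetical names. The generators are plain functions so the snippet runs without Faker, and in practice each would wrap a Faker call such as faker.name.firstName().

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Stand-in generators (assumption): swap in Faker calls in a real script.
let counter = 0;
const generators = {
    FIRST_NAME: () => 'First' + (++counter),
    LAST_NAME: () => 'Last' + counter,
    EMAIL_ADDRESS: () => 'user' + counter + '@example.com',
};

// Hypothetical helper: call every generator once to build one record.
function buildRecord(fields) {
    const record = {};
    for (const [key, generate] of Object.entries(fields)) {
        record[key] = generate();
    }
    return record;
}

// Hypothetical helper: write `count` records as a single JSON array.
// Records are serialized one at a time, so the full array is never held in memory.
function writeJson(filePath, fields, count) {
    const fd = fs.openSync(filePath, 'w');
    try {
        fs.writeSync(fd, '[\n');
        for (let i = 0; i < count; i++) {
            const separator = i === 0 ? '' : ',\n';
            fs.writeSync(fd, separator + JSON.stringify(buildRecord(fields)));
        }
        fs.writeSync(fd, '\n]\n');
    } finally {
        fs.closeSync(fd);
    }
}

const jsonPath = path.join(os.tmpdir(), 'faker-sketch.json');
writeJson(jsonPath, generators, 1000);
```

The synchronous fs.writeSync calls keep the sketch short. For very large files, the stream-based approach used for the CSV would work equally well.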

Other Features and API Methods of Faker

I have used only a subset of the supported API methods in the example above. Faker.js can generate fake data for many other areas, including commerce, company, date, finance, image, and random.

const faker = require('faker');

// jobs
let jobTitle = faker.name.jobTitle();
console.log(jobTitle);

let jobArea = faker.name.jobArea();
console.log(jobArea);

// dates
let futureDate = faker.date.future();
console.log(futureDate);

let recentDate = faker.date.recent();
console.log(recentDate);

let weekday = faker.date.weekday();
console.log(weekday);

// random values
let number = faker.random.number();
console.log(number);

let uuid = faker.random.uuid();
console.log(uuid);

let word = faker.random.word();
console.log(word);

let words = faker.random.words(6);
console.log(words);

// and so on...

You can also use it directly in the browser.

<script src = "faker.js" type = "text/javascript"></script>
<script>
  var randomName = faker.name.findName(); // Caitlyn Kerluke
  var randomEmail = faker.internet.email(); // Rusty@arne.info
  var randomCard = faker.helpers.createCard(); // random contact card containing many properties
</script>

Fake data is extremely useful when building and testing applications, and Faker makes it easy to generate. For a complete list of supported APIs, visit this link. Have you used Faker? How was your experience?
