Using marisa.cr for Efficient String Storage in Crystal

Bend @bendangelo

Publish Date: Apr 11

The marisa.cr Crystal shard gives you access to the MARISA trie, a space-efficient data structure for storing strings and searching them by prefix. Let's look at how to use it.

First, add the shard to your shard.yml, then run shards install:

dependencies:
  marisa:
    git: https://codeberg.org/bendangelo/marisa.cr.git

Basic Usage

Create a trie and add some strings:

require "marisa"

trie = Marisa::Trie.new
trie.add("snow")
trie.add("snow cone")
trie << "ice cream" # same as add

Search for keys that start with a given prefix:

trie.search("ice").keys
# => ["ice cream"]
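For an autocomplete feature, the prefix search can be wrapped in a small helper. This is a minimal sketch that relies only on the search(...).keys call shown in this post and assumes keys returns an array. The suggest helper and its limit parameter are hypothetical names, not part of the shard.

```crystal
require "marisa"

# Hypothetical helper (not part of marisa.cr): returns at most `limit`
# keys that start with `prefix`, using the search call shown in this post.
def suggest(trie : Marisa::Trie, prefix : String, limit : Int32 = 5)
  trie.search(prefix).keys.first(limit)
end

trie = Marisa::Trie.new
trie.add("snow")
trie.add("snow cone")
trie << "ice cream"

# Print the suggestions for the prefix "snow" on one line.
puts suggest(trie, "snow").join(", ")
```

Capping the result keeps the suggestion list short even when a prefix matches many keys.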

Check if a string exists:

trie.include?("snow") # => true

Working with Weights

Add strings with weights (useful for prioritization):

trie.add("ice", 1_f32)
trie.get_weight("ice") # => 1.0e-45_f32

Bulk Operations

Add multiple strings at once:

trie.add_many(["icicle", "snowball"])

Iterate through all keys:

trie.each do |key|
  puts key
end

Saving and Loading

Save your trie to disk:

trie.save("winter.trie")

Load it later:

trie = Marisa::Trie.new
trie.load("winter.trie")
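Saving and loading combine naturally into a build-once pattern: load the file if it already exists, otherwise build the trie and write it out. Here is a minimal sketch using only the add_many, save, load, and include? calls from this post, plus File.exists? from the Crystal standard library. The word list is made up for illustration.

```crystal
require "marisa"

# Illustrative file name and word list (assumptions, not from the shard).
path = "winter.trie"
words = ["snow", "snow cone", "ice cream", "icicle", "snowball"]

trie = Marisa::Trie.new
if File.exists?(path)
  # Reuse the trie written by an earlier run.
  trie.load(path)
else
  # First run: build from the word list, then persist it.
  trie.add_many(words)
  trie.save(path)
end

puts trie.include?("snowball")
```

This way the word list only needs to be processed on the first run, and later runs start from the saved file.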

Specialized Tries

For keys mapped to binary (byte string) values:

bytes_trie = Marisa::BytesTrie.new("one" => "1", "two" => "2")
bytes_trie["one"] # => "1"

For keys mapped to integer values:

int_trie = Marisa::IntTrie.new("one" => 1, "two" => 2, "onetwo" => 3)
int_trie["one"] # => 1
int_trie.sum("one") # => 4 (sums the values of all keys starting with "one": 1 + 3)

Advanced Options

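Pass initial keys and their weights to the constructor, along with build options such as binary mode, the number of tries, the cache size, and the node order: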

trie = Marisa::Trie.new(
  ["test"],
  [1.0_f32],
  binary: true,
  num_tries: 10,
  cache_size: :large,
  order: :weight
)

The marisa.cr shard is a great choice when you need compact, efficient string storage with fast lookup capabilities. Give it a try for your next autocomplete or search feature!
