Explanation of Ruby code for building Trie data structures
You're probably getting lost inside that mess of code which takes an approach that seems a better fit for C++ than for Ruby. Here's the same thing in a more concise format that uses a special case Hash for storage:
class Trie < Hash
  def initialize
    # Ensure that this is not a special Hash by disallowing
    # initialization options.
    super
  end

  def build(string)
    string.chars.inject(self) do |h, char|
      h[char] ||= { }
    end
  end
end
It works exactly the same but doesn't have nearly the same mess with pointers and such:
trie = Trie.new
trie.build('dogs')
puts trie.inspect
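To see what build produces when several words share a prefix, here is a minimal sketch. The prefix? helper is hypothetical (it is not part of the answer above) and is only there to show how the nested hashes can be walked.

```ruby
# Same idea as the Trie above, plus a hypothetical prefix? helper.
class Trie < Hash
  # Walk (and create as needed) one nested hash per character.
  def build(string)
    string.chars.inject(self) { |node, char| node[char] ||= {} }
  end

  # Hypothetical helper: true if some stored word starts with `string`.
  def prefix?(string)
    node = self
    string.each_char do |char|
      return false unless node.key?(char)
      node = node[char]
    end
    true
  end
end

trie = Trie.new
trie.build('dogs')
trie.build('dot')

p trie               # 'dogs' and 'dot' share the "d" -> "o" branch
p trie.prefix?('do') # true
p trie.prefix?('cat') # false
```

Because both words start with "do", the second build call reuses the existing nodes and only adds a "t" branch.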
Ruby's Enumerable module is full of amazingly useful methods like inject, which is precisely what you want for a situation like this.
Ruby trie implementation reference issue
Ruby's String#concat mutates the receiver in place and returns that same object rather than a new string. You probably want the + operator instead, which builds a new string. Change the two lines inside collect's for loop as follows:
stringn = string + letter
collect(node.hash[letter], stringn)
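The difference between the two calls can be checked in isolation. This is a small illustration with made-up variable names, separate from the collect method in the question:

```ruby
# String#concat changes the receiver and hands back the same object.
a = String.new('do')
b = a.concat('g')
puts a             # dog
puts b.equal?(a)   # true

# String#+ leaves the receiver alone and returns a new string.
c = String.new('do')
d = c + 'g'
puts c             # do
puts d             # dog
puts d.equal?(c)   # false
```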
Also, you probably want to either always initialize @words to empty in print before calling collect, or make it a local variable in print and pass it to collect.
Are data structures used in higher level languages?
"So since these higher level languages manage the memory for you, what would you use data structures for?"
The main reason for using a data structure has nothing to do with garbage collection. It is about storing data in a way that is efficient in some respect. What matters most is HOW you organize the data, which is exactly what the language can't figure out for you automatically.
Sure, a high-level language comes with several built-in data structures (and you should 100% use them when they are provided instead of writing your own), but not every data structure you may need is provided.
Data structures organize how data is laid out in memory so that the algorithms that run on them can be implemented efficiently.
For most tasks you won't need to implement your own data structures, but this depends entirely on what you are coding.
"I can understand the need for queues and stacks but would you ever need to use a binary tree in Ruby?"
There are plenty of uses for a binary tree, though not in common everyday projects; for example, you may need to implement Huffman coding.
Other data structures have their own uses: a trie gives you space savings and fast lookups, and a B-tree lets you store a LOT of data while keeping lookups fast. Each data structure is optimized for different things. Whether the language is modern or not, and whether it has garbage collection or not, doesn't change that.
The trend, though, is that custom data structures are implemented less often and thought about less. The same is true of common algorithms. With more modern tools, such as LINQ in .NET, you simply ask for the data to be sorted. You don't actually say how to sort.
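Ruby shows the same trend. A minimal illustration, with a made-up word list, in which the caller states the ordering and leaves the sorting algorithm to the runtime:

```ruby
words = %w[banana fig apple kiwi]

# Say what to order by (length, then alphabetically), not how to sort.
by_length = words.sort_by { |word| [word.length, word] }

p by_length
```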
Match pattern in Ruby with Regexp
Here's a method to find the longest common prefix in an array.
# Longest common prefix of two strings: try the longest candidate first
# and shrink it until both strings agree.
def _lcp(str1, str2)
  end_index = [str1.length, str2.length].min - 1
  end_index.downto(0) do |i|
    return str1[0..i] if str1[0..i] == str2[0..i]
  end
  ''
end

# Fold the pairwise prefix over the whole array.
def lcp(strings)
  strings.inject do |acc, str|
    _lcp(acc, str)
  end
end

lcp [
  'http://www.example.com?id=123456',
  'http://www.example.com?id=234567',
  'http://www.example.com?id=987654'
]
#=> "http://www.example.com?id="
lcp [
  'http://www.example.com?id=123456',
  'http://www.example.com?id=123457'
]
#=> "http://www.example.com?id=12345"
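The version above rebuilds and compares substrings for every candidate length. As an alternative sketch (not the answer's method), the prefix shared by the lexicographically smallest and largest strings is also shared by every string in between, so it is enough to compare those two character by character. The name lcp_minmax is hypothetical.

```ruby
# Alternative sketch: compare only the lexicographic min and max.
def lcp_minmax(strings)
  return '' if strings.empty?

  first, last = strings.minmax
  i = 0
  # Advance while both strings still have a matching character at i.
  i += 1 while i < first.length && first[i] == last[i]
  first[0, i]
end

urls = [
  'http://www.example.com?id=123456',
  'http://www.example.com?id=234567',
  'http://www.example.com?id=987654'
]
puts lcp_minmax(urls)
```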
Finding words frequency of huge data in a database
Finding information in huge data sets is done by parallelizing the work across a cluster rather than running it on a single machine.
What you are describing is a classic map-reduce problem, that can be handled using the following functions (in pseudo code):
map(doc):
  for each word in doc:
    emitIntermediate(word, "1")
reduce(word, list<counts>):
  emit(word, size(list))
A map-reduce framework, which is implemented in many languages, lets you scale the problem easily and use a huge cluster without much effort, taking care of failures and worker management for you.
Here, doc is a single document; the framework usually assumes a collection of documents. If you have only one huge document, you can of course split it into smaller documents and invoke the same algorithm.
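The pseudocode can be mimicked in a single Ruby process to see how the pieces fit. This is only a sketch: map_doc, reduce_word and the sample documents are hypothetical, and a real framework would run the map and reduce calls on different machines.

```ruby
# Map step: emit a [word, 1] pair for every word in one document.
def map_doc(doc)
  doc.downcase.scan(/\w+/).map { |word| [word, 1] }
end

# Reduce step: collapse all the counts emitted for one word.
def reduce_word(word, counts)
  [word, counts.sum]
end

docs = ['the quick brown fox', 'the lazy dog', 'The fox']

pairs   = docs.flat_map { |doc| map_doc(doc) }   # map over every document
grouped = pairs.group_by(&:first)                # shuffle: group pairs by word
result  = grouped.map { |word, ps| reduce_word(word, ps.map(&:last)) }.to_h

p result
```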
Ruby keep occurrence count in ordered data structure
So here is where I ended up, with a working solution. I used a normal array as a priority queue of sorts: rather than having the ID of the object be the key and the number of times it has been accessed be the value, I simply store the object ID in an array.
With an array of IDs, when it comes time to 'increment' I simply delete the ID from the array and push it back onto the end. Since arrays have 'implied' indexes, they preserve order.
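A minimal sketch of that approach, assuming integer IDs. The touch helper is a hypothetical name, not code from the answer:

```ruby
# Move an ID to the end of the list, adding it if it was not present.
def touch(list, id)
  list.delete(id)   # no-op (returns nil) when the ID is not in the list yet
  list.push(id)
end

recent = []
touch(recent, 1)
touch(recent, 2)
touch(recent, 3)
touch(recent, 1)    # 1 was accessed again, so it moves to the end

p recent            # [2, 3, 1]
```

Array#delete scans the whole array, so each access costs time proportional to the number of IDs; that is fine for small lists but worth keeping in mind for large ones.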
How to get the index positions (x,y) of the keys of a variably deep Trie in Ruby
Here's a simple recursive function that outputs the position of each key in the spreadsheet.
def to_coords hash, x = 0, y = 0
  hash.each do |k, v|
    puts "#{x},#{y} #{k}"
    x = to_coords(v, x, y + 1)
  end
  return x + (hash.empty? ? 1 : 0)
end
For your example, this outputs:
0,0 Canada
0,1 Male
0,2 Children
1,2 Old
2,2 Teenager
3,1 Female
3,2 Children
4,2 Old
5,2 Teenager
6,0 France
6,1 Male
6,2 Children
7,2 Old
8,2 Teenager
9,1 Female
9,2 Children
10,2 Old
11,2 Teenager
You didn't give a full example of your input so this will need to be tweaked a bit to fit your application. The basic idea is that if you are at the bottom level (Children, Old, Teenager), then each key is just shifted over by one, hence the hash.empty? ? 1 : 0. If you are not at the bottom level, then iterating over the subhashes tells you what X value to use next.
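If the positions are needed as data rather than printed text, the same recursion can yield each coordinate to a block. This is a sketch: each_coord is a hypothetical name, and the small input hash is an assumed shape (nested hashes with empty hashes as leaves), since the original input was not shown.

```ruby
# Same traversal as to_coords, but yields (x, y, key) instead of printing.
def each_coord(hash, x = 0, y = 0, &block)
  hash.each do |key, children|
    block.call(x, y, key)
    x = each_coord(children, x, y + 1, &block)
  end
  # A leaf (empty hash) occupies one column; inner nodes pass x through.
  x + (hash.empty? ? 1 : 0)
end

# Assumed input shape, trimmed down for the example.
tree = {
  'Canada' => {
    'Male'   => { 'Children' => {}, 'Old' => {} },
    'Female' => { 'Children' => {} }
  },
  'France' => {
    'Male' => { 'Old' => {} }
  }
}

cells = []
each_coord(tree) { |x, y, key| cells << [x, y, key] }
cells.each { |x, y, key| puts "#{x},#{y} #{key}" }
```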