Javascript based Search

One thing that I could not find for my website was a fully functional search. So I wrote one.

It works completely on the client side and uses a .json file generated by hexo-search.

I split my code into five sections:

  1. The HTML and CSS for the search
  2. The start up of the search
  3. The search function
  4. The Range class I wrote for this
  5. Some miscellaneous functions and methods

At the end of the post is the complete code.

HTML and CSS

This is the HTML and CSS code for the search input. Showing and hiding the input is done entirely in CSS, no JavaScript required. It uses the :checked selector of a hidden checkbox in combination with a label: clicking the label toggles the checkbox, and the checked state widens the input.

HTML Code
<input type="checkbox" id="searchCheckbox" name="searchCheckbox" class="nav-list-search-button">
<label for="searchCheckbox">
  <div class="fakeSearchButton"></div>
  <input type="search" class="nav-list-search">
</label>

CSS Code
header .nav-list-search-button {
  display: none;
}
header .nav-list-search-button+label .fakeSearchButton {
  display: inline-block;
  cursor: pointer;
  background: url(../images/search.png);
  background-size: contain;
  width: 12px;
  height: 12px;
  position: absolute;
  right: 15px;
  top: 25px;
}
header .nav-list-search-button+label .nav-list-search {
  display: inline-block;
  width: 0px;
  padding: 2px 2px;
  border: none;
  outline: none;
  border-bottom: 1px solid rgba(44,62,80,0);
  -webkit-transition: border-color 400ms, width 400ms, padding 400ms;
  transition: border-color 400ms, width 400ms, padding 400ms;
}
header .nav-list-search-button:checked+label .nav-list-search {
  width: 120px;
  padding-right: 18px;
  border-bottom: 1px solid #2c3e50;
}

Start Up

This start up code runs once while the page is loading.
It fetches search.json and adds an event listener to the search input.


Start Up Code
// in this global variable the json object will be saved
var searchJson = null;
// This function loads the search.json, parses it and saves it in the searchJson variable.
// Note: baseUrl is not defined in this snippet; it has to exist as a global before this runs.
function loadSearchJson() {
  var req = new XMLHttpRequest();
  req.addEventListener('load', function() {
    searchJson = JSON.parse(this.responseText);
    // remove entries without a title
    for(var i=0; i < searchJson.length; i++) {
      if(searchJson[i].title == "")
        searchJson.splice(i--, 1);
    }
  });
  req.open("GET", baseUrl + "search.json");
  req.send();
}
// It is called immediately
loadSearchJson();
// When the DOM is ready the search input field is prepared.
window.addEventListener('DOMContentLoaded', function() {
  // The text input gets an input handler that calls the search function with the input string
  document.querySelector('.nav-list-search').addEventListener('input', function() {
    search(this.value);
  });
});
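
The loader and the search function only rely on a handful of fields per entry. As a rough sketch, this is the shape they appear to assume, inferred from the fields the code in this post reads (title, url, content, categories, tags). The sample values and the helper name dropUntitled are invented for illustration.

```javascript
// Hypothetical sample of a parsed search.json. Only the field names come
// from the code in this post; the values are made up.
const sampleSearchJson = [
  {
    title: "Javascript based Search",
    url: "/2016/07/29/javascript-search/",
    content: "<p>One thing that I could not find ...</p>",
    categories: ["Programming"],
    tags: ["javascript", "search"]
  },
  { title: "", url: "/about/", content: "<p>...</p>", categories: [], tags: [] }
];

// Same effect as the splice loop in loadSearchJson for string titles:
// entries without a title are dropped.
function dropUntitled(entries) {
  return entries.filter(function (entry) {
    return entry.title !== "";
  });
}

console.log(dropUntitled(sampleSearchJson).length); // 1
```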

The Algorithm

The search algorithm splits the query into phrases of varying length. With these it can find the longest run of consecutive query words that appears in a post. The longer the matched phrase, the higher the score for this post.

Let’s take this query: "some special programming topic".

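The first step is splitting up the query into phrases of all lengths from 1 to 4 words.
The result is four groups of phrases, from the complete four-word query down to the four single words (see superSplit in the source).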

The idea behind this splitting is that the score of a post is higher if it contains "special programming topic" in a row, rather than the words "special", "programming" and "topic" independently.
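
To make the splitting concrete, here is a standalone sketch of the same idea as a plain function. The name wordRuns and the maxWords parameter are illustrative and not part of the post's code, which does this in String.prototype.superSplit with a fixed limit of 4.

```javascript
// Returns groups of phrases built from consecutive words, longest phrases first.
function wordRuns(query, maxWords) {
  // split on spaces and drop empty parts caused by repeated spaces
  const words = query.split(" ").filter(function (w) { return w !== ""; });
  const max = Math.min(maxWords, words.length);
  const groups = [];
  for (let count = max; count >= 1; count--) {
    const group = [];
    for (let start = 0; start + count <= words.length; start++) {
      group.push(words.slice(start, start + count).join(" "));
    }
    groups.push(group);
  }
  return groups;
}

wordRuns("some special programming topic", 4).forEach(function (group) {
  console.log(group.join(" | "));
});
// some special programming topic
// some special programming | special programming topic
// some special | special programming | programming topic
// some | special | programming | topic
```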

Each phrase is then searched for in the title and content, and the number of occurrences is used to calculate the score:
score += queryWordCount² * (occurrencesInTitle * 100 + occurrencesInContent)
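
A small worked example of this formula, as a sketch only: the helper names countOccurrences and phraseScore are hypothetical, and the title and content are made up. It covers a single phrase and leaves out the category and tag bonus.

```javascript
// Counts occurrences of needle in haystack, advancing one character at a
// time, so overlapping matches are counted as well.
function countOccurrences(haystack, needle) {
  if (needle === "") return 0;
  let count = 0;
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    count++;
    index = haystack.indexOf(needle, index + 1);
  }
  return count;
}

// Score contribution of one phrase:
// wordCount² * (occurrencesInTitle * 100 + occurrencesInContent)
function phraseScore(phrase, title, content) {
  const wordCount = phrase.split(" ").length;
  const inTitle = countOccurrences(title.toLowerCase(), phrase);
  const inContent = countOccurrences(content.toLowerCase(), phrase);
  return wordCount * wordCount * (inTitle * 100 + inContent);
}

const title = "A special programming topic";
const content = "This programming topic is special. Every programming topic is.";

console.log(phraseScore("special programming topic", title, content)); // 900
console.log(phraseScore("programming topic", title, content));         // 408
console.log(phraseScore("topic", title, content));                     // 102
```

The three-word phrase appears only once, in the title, yet it contributes more than the shorter phrases because of the squared word count.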

The categories and tags are also checked. A matching category adds 2000 and a tag 1000 to the score.

When the score of all posts is calculated they are sorted with the highest score on top.

After that, the first occurrence of each keyword in each post is located, and the post content is cut to 100 characters before and after it. Overlapping cuts are combined, and every keyword inside them is highlighted.
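
The cutting and highlighting can be sketched for a single keyword as follows. This is a simplified illustration, not the code used on this site: the function name excerpt and its radius parameter are assumptions, it uses split and join instead of the Range class, it returns an empty string when the keyword is missing, and it does not escape HTML in the text.

```javascript
// Cuts a window of `radius` characters around the first occurrence of the
// keyword and wraps every occurrence inside that window in a highlight span.
function excerpt(text, keyword, radius) {
  const index = text.indexOf(keyword);
  if (keyword === "" || index === -1) return "";
  const start = Math.max(index - radius, 0);
  const end = Math.min(index + keyword.length + radius, text.length);
  const open = '<span class="search-keyword">';
  const close = "</span>";
  return text.substring(start, end).split(keyword).join(open + keyword + close);
}

const text = "posts are sorted by score, then keywords are highlighted in the text";
console.log(excerpt(text, "keywords", 10));
// ore, then <span class="search-keyword">keywords</span> are highl
```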

Finally, the method generates the HTML code for the result list and displays it.
This code was generated when I searched for "keywords". You can see how the keyword is highlighted in the text:

<ul class="home post-list search-result">
  <li class="post-list-item">
    <article class="post-block">
      <h2 class="post-title">
        <a class="post-title-link" href="http://blog.localhost/2016/07/29/javascript-search/">Javascript based Search</a>
      </h2>
      <div class="post-content">
        <p>
          ulated they are sorted with the highest score on top. after that the first occurrences of all found <span class="search-keyword">keywords</span> in each post are search and highlighted. the content of the post is then cut +100 and -100 characte
        </p>
      </div>
    </article>
  </li>
</ul>


Search Method
function search(query) {
  // If the search json is not loaded yet do nothing.
  if(searchJson == null)
    return;
  // Get the result list element and clear it.
  var resultList = document.querySelector('ul.home.post-list.search-result');
  if(resultList)
    resultList.innerHTML = '';
  // x is declared here so the assignments below do not create a global variable
  var x;
  // If no query was entered 'unhide' the main content and paginator and return
  if(query == "") {
    if(x = document.querySelector('.main-content')) x.classList.remove('hide');
    if(x = document.querySelector('.paginator')) x.classList.remove('hide');
    return;
  }
  // .. else hide them.
  if(x = document.querySelector('.main-content')) x.classList.add('hide');
  if(x = document.querySelector('.paginator')) x.classList.add('hide');
  // convert the query to lower case to make the search case-insensitive
  query = query.toLowerCase();
  // use the superSplit method to split the query in all possible combinations
  var splitSplitQuery = query.superSplit(" ");
  // calculate the score of each post
  for(var i=0; i < searchJson.length; i++) {
    var post = searchJson[i];
    post.score = 0;
    // plain lower-case text of the post, without the HTML markup
    post.contentText = createElement({
      name: 'div',
      innerHTML: post.content.toLowerCase()
    }).innerText;
    // loop through all parts of the query
    for(var j=0; j < splitSplitQuery.length; j++) {
      var splitQuery = splitSplitQuery[j];
      for(var k=0; k < splitQuery.length; k++) {
        var queryPart = splitQuery[k];
        var titleIndexes = post.title.toLowerCase().superIndexOf(queryPart);
        post.score += Math.pow(splitSplitQuery.length - j, 2) * titleIndexes.length * 100;
        // count matches in the visible text rather than in the HTML markup
        var contentIndexes = post.contentText.superIndexOf(queryPart);
        post.score += Math.pow(splitSplitQuery.length - j, 2) * contentIndexes.length;
        // check for categories and tags only when fully split
        if(j == splitSplitQuery.length - 1) {
          for(var l=0; l < post.categories.length; l++) {
            if(post.categories[l].toLowerCase().indexOf(queryPart) != -1)
              post.score += 2000;
          }
          for(var l=0; l < post.tags.length; l++) {
            if(post.tags[l].toLowerCase().indexOf(queryPart) != -1)
              post.score += 1000;
          }
        }
      }
    }
  }
  // sort the posts according to their score
  searchJson.sort(function(a, b) {
    if(a.score < b.score)
      return 1;
    if(a.score > b.score)
      return -1;
    return 0;
  });
  // create the search result list element
  if(!resultList) {
    resultList = createElement({
      name: 'ul',
      classes: 'home post-list search-result',
      parent: document.querySelector('section.container')
    });
  }
  // create the html code for each post
  for(var i=0; i < searchJson.length; i++) {
    // the posts are sorted, so stop at the first post with a score of 0
    if(searchJson[i].score == 0)
      break;
    var contentText = searchJson[i].contentText;
    var ranges = [];
    // search for the first appearance of each keyword in the posts content
    // and add a range from 100 characters before and after the keyword
    // (if a keyword is not in the content, index is -1 and the range covers the start of the post)
    splitSplitQuery.lastElement().forEach(function(keyword) {
      var index = contentText.indexOf(keyword);
      ranges.push(new Range(Math.max(index - 100, 0), Math.min(index + keyword.length + 100, contentText.length)));
    });
    // combine the ranges and keep the excerpts in the order they appear in the post
    var combinedRanges = Range.combineArray(ranges);
    Range.sortArray(combinedRanges);
    var resultContentText = "";
    // in each of the remaining ranges mark all keywords
    for(var j=0; j < combinedRanges.length; j++) {
      var part = contentText.substring(combinedRanges[j].start, combinedRanges[j].end);
      var markRanges = [];
      splitSplitQuery.lastElement().forEach(function(keyword) {
        var indexes = part.superIndexOf(keyword);
        for(var k=0; k < indexes.length; k++)
          markRanges.push(new Range(indexes[k], indexes[k] + keyword.length));
      });
      var combinedMarkRanges = Range.combineArray(markRanges);
      Range.sortArray(combinedMarkRanges);
      // offset keeps track of how many characters the inserted tags have added so far
      var offset = 0;
      for(var k=0; k < combinedMarkRanges.length; k++) {
        part = part.insert(combinedMarkRanges[k].start + offset, "<span class=\"search-keyword\">");
        offset += "<span class=\"search-keyword\">".length;
        part = part.insert(combinedMarkRanges[k].end + offset, "</span>");
        offset += "</span>".length;
      }
      resultContentText += "<p>" + part + "</p>\n";
    }
    // create the code for the post
    createElement({
      name: 'li',
      classes: 'post-list-item',
      parent: resultList,
      childs: [createElement({
        name: 'article',
        classes: 'post-block',
        childs: [
          createElement({
            name: 'h2',
            classes: 'post-title',
            childs: [createElement({
              name: 'a',
              classes: 'post-title-link',
              href: searchJson[i].url,
              innerText: searchJson[i].title,
            })]
          }),
          createElement({
            name: 'div',
            classes: 'post-content',
            innerHTML: resultContentText,
          })
        ]
      })]
    });
  }
  // if no post was found display a "-1"
  if(i == 0) {
    createElement({
      name: 'li',
      classes: 'post-list-item',
      parent: resultList,
      childs: [createElement({
        name: 'article',
        classes: 'post-block',
        childs: [createElement({
          name: 'h3',
          classes: 'nothing-found',
          innerText: '-1'
        })]
      })]
    });
  }
}

Range Class

The Range class represents a range between two numbers. It simplifies the work with ranges by providing methods to combine ranges, check if ranges intersect or are equal and combine or sort an array of ranges.

This class is used to store the ranges of post content that should be shown in the result: 100 characters before and after each keyword.
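
To see why combining matters, consider two keywords that are found close to each other: their windows overlap and should be shown as one excerpt rather than two. The sketch below uses plain objects instead of the Range class, and the names windowAround, overlaps and union as well as the positions are invented for illustration.

```javascript
// Window of `radius` characters around a match, clamped to the text bounds.
function windowAround(index, length, radius, textLength) {
  return {
    start: Math.max(index - radius, 0),
    end: Math.min(index + length + radius, textLength)
  };
}

// Two ranges overlap (or touch) if each one starts before the other ends.
function overlaps(a, b) {
  return a.start <= b.end && b.start <= a.end;
}

// Smallest range that covers both a and b.
function union(a, b) {
  return { start: Math.min(a.start, b.start), end: Math.max(a.end, b.end) };
}

// Two keywords, 7 characters long, found at positions 150 and 300
// in a post of 1000 characters.
const first = windowAround(150, 7, 100, 1000);  // { start: 50, end: 257 }
const second = windowAround(300, 7, 100, 1000); // { start: 200, end: 407 }

console.log(overlaps(first, second)); // true
console.log(union(first, second));    // { start: 50, end: 407 }
```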


Range Class
var Range = function(start, end) {
  this.start = start;
  this.end = end;
};
// true if the two ranges overlap or touch
Range.prototype.intersects = function(range2) {
  return (range2.end >= this.start && range2.start <= this.start) ||
    (range2.start <= this.end && range2.end >= this.end) ||
    (range2.start <= this.start && range2.end >= this.end) ||
    (range2.start > this.start && range2.end < this.end);
};
Range.prototype.equals = function(range2) {
  return (this.start == range2.start && this.end == range2.end);
};
// returns the smallest range that covers both ranges
Range.prototype.combine = function(range2) {
  return new Range(Math.min(this.start, range2.start), Math.max(this.end, range2.end));
};
// Merges all intersecting ranges and returns the result sorted by start.
// The earlier nested-loop version could leave overlapping ranges in the result.
// Here a copy of the input is sorted first, so a single pass is enough:
// each range can only intersect the most recently merged one.
Range.combineArray = function(ranges) {
  if(ranges.length == 0)
    return [];
  var sorted = ranges.slice();
  Range.sortArray(sorted);
  var resultRanges = [sorted[0]];
  for(var j=1; j < sorted.length; j++) {
    var last = resultRanges[resultRanges.length - 1];
    if(sorted[j].intersects(last))
      resultRanges[resultRanges.length - 1] = last.combine(sorted[j]);
    else
      resultRanges.push(sorted[j]);
  }
  return resultRanges;
};
// sorts the array in place by the start of each range
Range.sortArray = function(ranges) {
  ranges.sort(function(a, b) {
    if(a.start > b.start)
      return 1;
    if(a.start < b.start)
      return -1;
    return 0;
  });
};

Miscellaneous

These are some miscellaneous functions and methods I’m using to make it easier to work with JavaScript.


Create Element
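// Creates a DOM element from a parameter object.
// Supported params: name (tag name, required), id, classes, href, innerText, innerHTML, parent, childs
function createElement(params) {
  if(!params.name)
    return null;
  var node = document.createElement(params.name);
  // x is declared here so the assignments below do not create a global variable
  var x;
  if(x = params.id) node.id = x;
  if(x = params.classes) node.className = x;
  if(x = params.href) node.href = x;
  if(x = params.innerText) node.innerText = x;
  if(x = params.innerHTML) node.innerHTML = x;
  if(x = params.parent) x.appendChild(node);
  if(x = params.childs) {
    // accept array-like objects (e.g. a NodeList) as well as arrays
    if(typeof x.forEach !== 'function') x = [].slice.call(x);
    x.forEach(function(child) {
      node.appendChild(child);
    });
  }
  return node;
}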



Super Extensions
// joins `length` elements of the array, beginning at index `start`
Array.prototype.superJoin = function(separator, start, length) {
  var result = "";
  for(var i = start; i < this.length-1 && i < start+length-1; i++)
    result += this[i] + separator;
  result += this[i];
  return result;
};
Array.prototype.lastElement = function() {
  return this[this.length-1];
};
// splits the string into groups of phrases with 4 (or fewer) down to 1 consecutive words;
// the last group contains the single words
String.prototype.superSplit = function(separator) {
  var split = this.split(separator);
  // remove empty parts caused by repeated separators
  for(var i=0; i<split.length; i++) {
    if(split[i] == "")
      split.splice(i--, 1);
  }
  var resultArray = [];
  var max = Math.min(4, split.length);
  for(var wordCount = max; wordCount > 1; wordCount--) {
    resultArray[max-wordCount] = [];
    for(var startIndex = 0; startIndex <= split.length - wordCount; startIndex++) {
      resultArray[max-wordCount][startIndex] = split.superJoin(" ", startIndex, wordCount);
    }
  }
  resultArray[max-1] = split;
  return resultArray;
};
// returns the indexes of all occurrences of stringValue, including overlapping ones
String.prototype.superIndexOf = function(stringValue) {
  var index = -1;
  var indexes = [];
  if(stringValue == "")
    return indexes;
  do {
    index = this.indexOf(stringValue, index + 1);
    if(index != -1)
      indexes.push(index);
  } while(index != -1);
  return indexes;
};
// returns a copy of the string with `string` inserted at `index`
String.prototype.insert = function (index, string) {
  if (index > 0)
    return this.substring(0, index) + string + this.substring(index, this.length);
  else
    return string + this;
};

Complete Code

Here you can download the complete code either minified or not.

search_complete.js
search_complete.min.js