I have a really huge array. It may contain millions of objects and looks like this:

    srcArray = [{
        'name': 'AAA',
        'keyA': true,
        'keyB': 'blahblah'
    }, 
    //...
    {
        'name': 'ZZZ',
        'keyA': false,
        'keyB': 'testString'
    }];

I have to find every matching object and put it into another array. The problem is that the whole array cannot be held in memory at once because it is too big. The idea is to split it into small parts, load the first part, search it, then replace it with the next part, and so on until the last one. See this example (with three very small parts):

Part 1, file part1.js:

    srcArray = [{
        'name': 'A1',
        'boolA': false,
        'strB': 'testString'
    }, {
        'name': 'B1',
        'boolA': false,
        'strB': 'blahblah'
    }, {
        'name': 'C1',
        'boolA': true,
        'strB': 'blahblah'
    }];

Part 2, file part2.js:

    srcArray = [{
        'name': 'A2',
        'boolA': false,
        'strB': 'blahblah'
    }, {
        'name': 'B2',
        'boolA': true,
        'strB': 'testString'
    }, {
        'name': 'C2',
        'boolA': false,
        'strB': 'blahblah'
    }];

Part 3, file part3.js:

    srcArray = [{
        'name': 'A3',
        'boolA': true,
        'strB': 'blahblah'
    }, {
        'name': 'B3',
        'boolA': false,
        'strB': 'blahblah'
    }, {
        'name': 'C3',
        'boolA': false,
        'strB': 'testString'
    }];
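Stripped of the loading mechanics, the piece-by-piece idea amounts to the loop below. This is only a minimal sketch: `searchParts`, `matches` and the `loadPart` callback are hypothetical names, and the parts are faked with in-memory arrays, so it illustrates the control flow rather than any real memory saving.

```javascript
// Hypothetical helper: true if any own string property of obj contains needle.
function matches(obj, needle) {
    for (var key in obj) {
        if (Object.prototype.hasOwnProperty.call(obj, key) &&
                typeof obj[key] === 'string' &&
                obj[key].indexOf(needle) > -1) {
            return true;
        }
    }
    return false;
}

// Search the parts one at a time. loadPart(i) is an assumed callback that
// returns the i-th part as an array; only the matches are kept afterwards.
function searchParts(partCount, loadPart, needle) {
    var found = [];
    for (var i = 0; i < partCount; i++) {
        var part = loadPart(i);
        for (var j = 0; j < part.length; j++) {
            if (matches(part[j], needle)) found.push(part[j]);
        }
        part = null; // drop our reference to the part before loading the next one
    }
    return found;
}

// Stand-in data mirroring part1.js, part2.js and part3.js.
var parts = [
    [{ name: 'A1', boolA: false, strB: 'testString' },
     { name: 'B1', boolA: false, strB: 'blahblah' },
     { name: 'C1', boolA: true,  strB: 'blahblah' }],
    [{ name: 'A2', boolA: false, strB: 'blahblah' },
     { name: 'B2', boolA: true,  strB: 'testString' },
     { name: 'C2', boolA: false, strB: 'blahblah' }],
    [{ name: 'A3', boolA: true,  strB: 'blahblah' },
     { name: 'B3', boolA: false, strB: 'blahblah' },
     { name: 'C3', boolA: false, strB: 'testString' }]
];

var result = searchParts(parts.length, function (i) { return parts[i]; }, 'testString');
console.log(result.map(function (o) { return o.name; })); // [ 'A1', 'B2', 'C3' ]
```

In the real page `loadPart` cannot be a plain synchronous function, because each part arrives through a `<script>` tag; the sketch only shows that nothing except `found` needs to outlive a single iteration.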

The code:

<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Test</title>
        <script src="jquery.min.js"></script>
        <script type="text/javascript">
            srcArray = [];
            found = [];
        </script>
        <script>
            $(document).ready(function () {
                var loadScripts = function (scripts) {
                        var result = function () {
                                delete srcArray; 
                                srcArray = undefined;
                                console.log(found);
                            }
                        var func = function (i) {
                                if (i >= scripts.length) return result();
                                var head = document.getElementsByTagName("head")[0];
                                var scr = document.createElement('script');
                                scr.src = scripts[i];
                                var f = false;
                                scr.onload = scr.onreadystatechange = function () {
                                    if (!f && (!scr.readyState || scr.readyState == "loaded" || scr.readyState == "complete")) {
                                        f = true;
                                        search('testString', srcArray);
                                        setTimeout(function () {
                                            func(++i)
                                        }, 50);
                                        scr.onload = scr.onreadystatechange = null;
                                        if (head && scr.parentNode) {
                                            head.removeChild(scr);
                                        }
                                    }
                                }
                                return head.insertBefore(scr, head.firstChild);
                            }
                        func(0);
                    }
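                function search(s, arr) {
                    // Walk the current part backwards and collect every object that
                    // has at least one own string property containing s.
                    for (var i = arr.length; i--;) {
                        for (var key in arr[i]) {
                            if (arr[i].hasOwnProperty(key) && typeof arr[i][key] === 'string' && arr[i][key].indexOf(s) > -1) {
                                found.push(arr[i]);
                                break; // push each object once, even if several properties match
                            }
                        }
                    }
                    return found;
                }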
                $('.click').click(function () {
                    delete found; 
                    found = undefined;
                    srcArray = [];
                    setTimeout(function () {
                        found = [];
                        loadScripts(["part1.js", "part2.js", "part3.js"])
                    }, 50);
                });
            });
        </script>
    </head>
    <body>
        <span class="click">click me</span>
    </body>
</html>

Yes, I realize that global variables are ugly, but I can't see another way. I tested this code with 10 parts of 250k objects each, that is, 2.5 million objects and about 260MB of data in total. The test took about 20 seconds to complete, but I don't care about the time; I do care about memory. Firefox did best, occupying 140MB of RAM after 4 clicks (which means the browser had searched through more than 1GB of script data, a very good result). Chrome and Opera 11.52 were much worse, and IE9 simply failed on the second click, having used about 1GB of RAM after the first.

Question 1. Is it possible to solve this kind of problem in JavaScript? Is it possible to search through a huge data set while avoiding memory leaks?

Question 2. How can I stop the process? That is, when I click some #stop element, the function must stop and release the found array (which will contain as many objects as the function has found so far).
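One possible shape for a stoppable search is sketched below, under the assumption that the work is already divided into parts. `createSearch`, `step` and `stop` are hypothetical names, not part of the page above, and the parts are again faked with in-memory arrays. A flag is checked before each part, so stopping takes effect at the next part boundary rather than in the middle of one.

```javascript
// Hypothetical controller: each step() searches one part,
// stop() makes every later step() a no-op and returns the partial result.
function createSearch(partCount, loadPart, needle) {
    var found = [];
    var next = 0;
    var stopped = false;

    function containsNeedle(obj) {
        for (var key in obj) {
            if (Object.prototype.hasOwnProperty.call(obj, key) &&
                    typeof obj[key] === 'string' &&
                    obj[key].indexOf(needle) > -1) {
                return true;
            }
        }
        return false;
    }

    return {
        // Search the next part; returns true while more parts remain.
        step: function () {
            if (stopped || next >= partCount) return false;
            var part = loadPart(next++);
            for (var j = 0; j < part.length; j++) {
                if (containsNeedle(part[j])) found.push(part[j]);
            }
            return next < partCount;
        },
        // Stop the search and hand back whatever has been found so far.
        stop: function () {
            stopped = true;
            var partial = found;
            found = []; // the controller no longer holds the results
            return partial;
        }
    };
}

// Stand-in data: one matching object per part.
var parts = [
    [{ name: 'A1', strB: 'testString' }],
    [{ name: 'B2', strB: 'testString' }],
    [{ name: 'C3', strB: 'testString' }]
];

var search = createSearch(parts.length, function (i) { return parts[i]; }, 'testString');
search.step();               // searches part 0
search.step();               // searches part 1
var partial = search.stop(); // e.g. the user clicked #stop
search.step();               // ignored: part 2 is never loaded
console.log(partial.map(function (o) { return o.name; })); // [ 'A1', 'B2' ]
```

In the page above, the equivalent would presumably be to check such a flag inside `func` before inserting the next `<script>` tag, and to set it from a click handler on `#stop`; that wiring is an assumption and is not shown here.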
