When I try to make and hash objects from a file containing one million songs, I get a weird segmentation error after about 12000 successful hashes.

Anyone have any idea why this:

Segmentation fault: 11

happens when I run the program?

I have these classes for hashing the objects:

class Node():
    def __init__(self, key, value = None):
        self.key = key
        self.value = value

    def __str__(self):
        return str(self.key) + " : " + str(self.value)

class Hashtable():
    def __init__(self, hashsize, hashlist = [None]):
        self.hashsize = hashsize*2
        self.hashlist = hashlist*(self.hashsize)

    def __str__(self):
        return self.hashlist

    def hash_num(self, name):
        result = 0
        name_list = list(name)
        for letter in name_list:
            result = (result*self.hashsize + ord(letter))%self.hashsize
        return result

    def check(self, num):
        if self.hashlist[num] != None:
            num = (num + 11**2)%self.hashsize#Check this really carefully!
            chk_num = self.check(num)#this too
            return chk_num#learn this
        else:
            return num

    def check_atom(self, num, name):
        if self.hashlist[num].key == name:
            return num
        else:
            num = (num + 11**2)%self.hashsize
            chk_num = self.check_atom(num, name)#read here
            return chk_num#read this

    def put(self, name, new_atom):
        node = Node(name)
        node.value = new_atom
        num = self.hash_num(name)
        chk_num = self.check(num)
        print(chk_num)
        self.hashlist[chk_num] = node

    def get(self, name):
        num = self.hash_num(name)
        chk_num = self.check_atom(num, name)
        atom = self.hashlist[chk_num]
        return atom.value

And I call it from this code:

from time import *
from hashlist import *
import sys

sys.setrecursionlimit(1000000000)

def lasfil(filnamn, h):
    with open(filnamn, "r", encoding="utf-8") as fil:
        for rad in fil:
            data = rad.split("<SEP>")
            artist = data[2].strip()
            song = data[3].strip()
            h.put(artist, song)

def hitta(artist, h):
    try:
        start = time()
        print(h.get(artist))
        stop = time()
        tidhash = stop - start
        return tidhash
    except AttributeError:
        pass

h = Hashtable(1000000)
lasfil("write.txt", h)

1 Answer

The reason you are getting a segmentation fault is this line:

sys.setrecursionlimit(1000000000)

I assume you added it because you were getting RuntimeError: maximum recursion depth exceeded. Raising the recursion limit does not allocate more memory for the call stack, it merely postpones that exception. If you set it too high, the interpreter runs out of stack space and accesses memory that does not belong to it, which causes random errors (probably a segfault, but in theory anything is possible).
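To illustrate the trade-off (this toy `countdown` function is mine, not from the question): at the default limit, deep recursion fails with a safe RecursionError; an equivalent loop has no depth limit at all.

```python
import sys

sys.setrecursionlimit(1000)  # roughly the default order of magnitude

def countdown(n):
    # Each call adds a stack frame, so depth grows linearly with n.
    if n == 0:
        return 0
    return countdown(n - 1)

try:
    countdown(5000)          # far deeper than the limit allows
except RecursionError:
    print("RecursionError")  # the safe failure the limit exists to provide

def countdown_iter(n):
    # The iterative version uses constant stack space.
    while n > 0:
        n -= 1
    return n

countdown_iter(5_000_000)    # no depth problems, however large n gets
```

Raising the limit with `sys.setrecursionlimit` only moves the point where `countdown` fails, and past a certain value the failure stops being a Python exception and becomes a crash.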

The real solution is to not use unbounded recursion. For something like a balanced search tree, where the recursion depth is limited to a few dozen levels, it is fine, but you cannot replace long loops with recursion.
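Concretely, the question's recursive `check` probe can be written as a plain loop. This is a sketch, not the asker's code; it assumes the same `hashsize` and `hashlist` attributes as the class in the question, and only `check` is shown rewritten (`check_atom` can be converted the same way):

```python
class Hashtable:
    def __init__(self, hashsize):
        self.hashsize = hashsize * 2
        self.hashlist = [None] * self.hashsize

    def check(self, num):
        # Probe iteratively: the probe sequence can be arbitrarily long
        # without consuming a single call-stack frame.
        while self.hashlist[num] is not None:
            num = (num + 11**2) % self.hashsize
        return num
```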

Also, unless this is an exercise in writing a hash table, you should just use the built-in dict. If it is an exercise in writing a hash table, take this as a hint that your hash table is bad: it suggests probe lengths of at least 1000, more likely several thousand. They should be at most a few dozen, ideally in the single digits.
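For comparison, a sketch of the question's loading function built on dict instead of the custom table (the `<SEP>` format and column indices are taken from the question; the file name in the usage is hypothetical):

```python
def lasfil(filnamn):
    h = {}
    with open(filnamn, "r", encoding="utf-8") as fil:
        for rad in fil:
            data = rad.split("<SEP>")
            # artist -> song, exactly what the question's put() call stores
            h[data[2].strip()] = data[3].strip()
    return h
```

No probing, no recursion, and a lookup is simply `h[artist]`; a million entries is well within what dict handles routinely.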

Answered 2013-10-11T14:52:49.007