1 votes

Hashing files with hashlib, TypeError: object supporting the buffer API required

I am trying to build an application where I can specify a file and get its hashes (SHA1, SHA256, MD5).

The problem is that when I compute the hash I get the following error, which seems to refer to a missing API:

Traceback (most recent call last):
  File "C:/Python/venv/Herramientas/Hashes.py", line 18, in <module>
    objeto_hash = hashlib.sha1(objeto_fichero)
TypeError: object supporting the buffer API required

The code is as follows:

import hashlib
ruta = 'C:\Python\pep8es.pdf'
objeto_fichero = open(ruta,mode = 'rb')
#cadena_input = input('Introduce la cadena para sacar su hash: ')
objeto_hash = hashlib.sha1(objeto_fichero)
hex_dig = objeto_hash.hexdigest()
print('sha1-> ',hex_dig)
objeto_hash = hashlib.sha256(objeto_fichero)
hex_dig = objeto_hash.hexdigest()
objeto_hash = hashlib.md5(b'Hola gente')
hex_dig = objeto_hash.hexdigest()

Can anyone tell me where it fails or where I should look to fix the problem?


FJSevilla Points 29084

Both the constructors of the algorithms and the update method require, as the error states, bytes-like objects, that is, objects that support the buffer protocol.

You must therefore pass a bytes object with the contents of the file, not the file object itself (an _io.BufferedReader), for example by using the read method:

import hashlib

ruta = 'C:/Python/pep8es.pdf'

with open(ruta, mode='rb') as objeto_fichero:
    content = objeto_fichero.read()
    sha1_hash = hashlib.sha1(content)
    sha256_hash = hashlib.sha256(content)
    md5_hash = hashlib.md5(content)

print('sha1-> ', sha1_hash.hexdigest())
print('sha256->', sha256_hash.hexdigest())
print('md5->', md5_hash.hexdigest())

Since read loads the entire contents of the file into memory, if you are going to hash very large files and want to avoid running out of RAM, you can read the file in chunks of whatever size you want and feed each chunk to the update method:

import hashlib

BUFFSIZE = 131072  # Chunk size in bytes
ruta = 'C:/Python/pep8es.pdf'

sha1_hash = hashlib.sha1()
sha256_hash = hashlib.sha256()
md5_hash = hashlib.md5()

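with open(ruta, 'rb') as objeto_fichero:
    buff = objeto_fichero.read(BUFFSIZE)
    while buff:
        # Feed the current chunk to every hash object before reading the next one
        sha1_hash.update(buff)
        sha256_hash.update(buff)
        md5_hash.update(buff)
        buff = objeto_fichero.read(BUFFSIZE)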

print('sha1-> ', sha1_hash.hexdigest())
print('sha256->', sha256_hash.hexdigest())
print('md5->', md5_hash.hexdigest())
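As a rough sketch of the same chunked approach wrapped in a reusable function: the name hash_file, its parameters and the temporary demo file are assumptions made for illustration, not part of the original answer. It uses hashlib.new to build the hash objects by name and iter with a sentinel to read until the file is exhausted.

```python
import hashlib
import os
import tempfile


def hash_file(path, algorithms=("sha1", "sha256", "md5"), chunk_size=131072):
    """Return {algorithm_name: hex_digest} for the file at path, read in chunks."""
    # hash_file is a hypothetical helper name, not a hashlib function.
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read(chunk_size) until it returns b""
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Demo on a throwaway temporary file so the sketch does not depend on a real path.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Hola gente" * 100000)  # about 1 MB, so several chunks are read
    demo_path = tmp.name

digests = hash_file(demo_path)
os.remove(demo_path)

for name, digest in digests.items():
    print(name, "->", digest)
```

Reading the file once and updating all hash objects per chunk avoids opening the file several times. On Python 3.11 and later, hashlib.file_digest may be an alternative for a single algorithm.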

A string returned by input in Python 3 is not a valid argument either, since it is a str object (Unicode text) rather than bytes. To solve this you just need to encode the string, for example as UTF-8, to get a bytes object:

import hashlib

cadena_input = input('Introduce la cadena para sacar su hash: ')
cadena_cod = cadena_input.encode("UTF-8")
sha1_hash = hashlib.sha1(cadena_cod)
print('sha1-> ', sha1_hash.hexdigest())
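To make the point about encoding concrete, here is a small sketch (the variable names are illustrative assumptions). It shows that passing a str raises TypeError, and that the chosen encoding matters: the same text encoded differently produces different bytes and therefore a different hash.

```python
import hashlib

texto = "Hola gente"  # a str: Unicode text, not bytes

error_message = None
try:
    hashlib.sha1(texto)  # passing a str directly is rejected
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# The same text gives different bytes, and so different hashes, per encoding.
sha1_utf8 = hashlib.sha1(texto.encode("utf-8")).hexdigest()
sha1_utf16 = hashlib.sha1(texto.encode("utf-16")).hexdigest()
print("utf-8  ->", sha1_utf8)
print("utf-16 ->", sha1_utf16)
```

If the hash has to match one computed elsewhere, both sides need to agree on the encoding.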

Remember to always close a file when you are done with it, either explicitly with close() or by using the with context manager. This is not only good practice: leaving it to the garbage collector may cause undesired behavior...
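A minimal sketch of what that means in practice, assuming a temporary demo file (the file and variable names are illustrative): the with block closes the file even if an exception is raised inside it, which is roughly what a manual try/finally with close() does.

```python
import os
import tempfile

# Create a small temporary file to work with (demo data only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Hola gente")
    demo_path = tmp.name

# With a context manager: the file is closed even though an exception occurs.
try:
    with open(demo_path, "rb") as fh:
        fh.read()
        raise RuntimeError("simulated failure while the file is open")
except RuntimeError:
    pass

closed_after_with = fh.closed
print("closed after with:", closed_after_with)

# Roughly equivalent manual form: close() goes in a finally clause.
fh2 = open(demo_path, "rb")
try:
    fh2.read()
finally:
    fh2.close()
closed_after_finally = fh2.closed
print("closed after finally:", closed_after_finally)

os.remove(demo_path)
```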

0 votes

First of all, thanks for the answer. Looking at the code I have a question: I've been learning Python for a short time and I see that you use with together with while. Could you please tell me what with is for, or where I can find information about it? The same goes for the use of the buffer, which I do not understand from the official documentation.

0 votes

with is not directly related to the while loop. with is used with objects that support the context manager protocol and guarantees that one or more statements will be executed automatically. In this particular case it takes care of properly closing the file for us. Take a look at this question: What is the "with" keyword used for and how does it work in Python? or the official documentation on the with statement for more information. Best regards.

