What does Python's sys.getsizeof return for a string?


What does sys.getsizeof return for a standard string? I am noticing that this value is higher than what len returns.

I will attempt to answer your question from a broader point of view. You're referring to two functions and comparing their outputs. Let's take a look at their documentation first:

len(): Return the length (the number of items) of an object. The argument may be a sequence (such as a string, bytes, tuple, list, or range) or a collection (such as a dictionary, set, or frozen set).

So in the case of a string, you can expect len() to return the number of characters.

sys.getsizeof(): Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

So in the case of a string (as with many other objects), you can expect sys.getsizeof() to return the size of the object in bytes. There is no reason to think that it should be the same as the number of characters.

Let's have a look at some examples:

>>> import sys
>>> first = "first"
>>> len(first)
5
>>> sys.getsizeof(first)
42

This example confirms that the size is not the same as the number of characters.

>>> second = "second"
>>> len(second)
6
>>> sys.getsizeof(second)
43

We can notice that if we look at a string one character longer, its size is one byte bigger as well. We don't know yet whether that is a coincidence, though.
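To check whether the one-byte step is a coincidence, here is a minimal sketch that measures a range of lengths. It assumes CPython and pure-ASCII strings; the helper name `size_steps` is made up for illustration, and the absolute sizes will differ between Python versions and platforms.

```python
import sys


def size_steps(max_len=8):
    """Return getsizeof() differences between consecutive ASCII string lengths."""
    sizes = [sys.getsizeof("a" * n) for n in range(max_len + 1)]
    # Difference between each size and the one before it.
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


# On CPython, each extra ASCII character is expected to add one byte.
print(size_steps())
```

If every step printed is 1, the size grows by exactly one byte per character on that build, which suggests the pattern above is not an accident.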

>>> together = first + second
>>> print(together)
firstsecond
>>> len(together)
11

If we concatenate the two strings, their combined length is equal to the sum of their lengths, which makes sense.

>>> sys.getsizeof(together)
48

Contrary to what one might expect, though, the size of the combined string is not equal to the sum of the individual sizes. It still seems to be the length plus something; in particular, something worth 37 bytes. You need to realize that it's 37 bytes in this particular case, using this particular Python implementation and so on. You should not rely on that at all. Still, we can take a look at what those 37 bytes are (approximately) used for.
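The "length plus something" pattern can be written down directly. The sketch below assumes CPython and ASCII-only strings; `overhead` is a hypothetical helper, and the value it reports is 37 only on a build like the one used above (it is different on Python 3).

```python
import sys


def overhead(s):
    """Bytes reported by getsizeof() beyond one byte per character (ASCII only)."""
    return sys.getsizeof(s) - len(s)


first = "first"
second = "second"
together = first + second

# The fixed part is paid once per object, so concatenating two strings
# saves one copy of it compared with the sum of the two sizes.
fixed = overhead(together)
print(fixed, overhead(first), overhead(second))
print(sys.getsizeof(first) + sys.getsizeof(second) - sys.getsizeof(together))
```

All the printed numbers should agree on a given build: the fixed part is the same for each string, and it is exactly what goes missing when two sizes are added together.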

String objects in CPython (probably the most widely used implementation of Python) are implemented as PyStringObject. This is the C source code (I use the 2.7.9 version):

typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;
    int ob_sstate;
    char ob_sval[1];

    /* Invariants:
     *     ob_sval contains space for 'ob_size+1' elements.
     *     ob_sval[ob_size] == 0.
     *     ob_shash is the hash of the string or -1 if not computed yet.
     *     ob_sstate != 0 iff the string object is in stringobject.c's
     *       'interned' dictionary; in this case the two references
     *       from 'interned' to this object are *not counted* in ob_refcnt.
     */
} PyStringObject;

You can see that there is something called PyObject_VAR_HEAD, one int, one long and a char array. The char array will always contain one more character to store the '\0' at the end of the string. This, along with the int, the long and PyObject_VAR_HEAD, takes the additional 37 bytes. PyObject_VAR_HEAD is defined in another C source file and refers to other implementation-specific stuff, which you need to explore if you want to find out where exactly the 37 bytes go. Plus, the documentation mentions that sys.getsizeof()

adds an additional garbage collector overhead if the object is managed by the garbage collector.
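One way to see that garbage collector overhead is to compare `sys.getsizeof()` with the object's own `__sizeof__()` method, which does not include it. This is a rough sketch assuming CPython; `gc_overhead` is an illustrative name, and the exact number of bytes depends on the build.

```python
import gc
import sys


def gc_overhead(obj):
    """Difference between sys.getsizeof() and the object's own __sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()


# Strings cannot contain references to other objects, so the collector
# does not track them; lists can, so they are tracked.
print(gc.is_tracked("first"), gc_overhead("first"))
print(gc.is_tracked([]), gc_overhead([]))
```

For a string the difference should be zero, so the garbage collector plays no part in the extra bytes discussed here; it matters for container types such as lists.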

Overall, you don't need to know what exactly takes up the something (the 37 bytes here), but this answer should give you an idea of why the numbers differ and where to find more information should you need it.
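If you do want to track down the 37 bytes, the struct can be mirrored with ctypes. This is only a sketch under assumptions: it takes PyObject_VAR_HEAD in a non-debug CPython 2.7 build to be a reference count, a type pointer and an item count, and the class name `PyStringObjectLayout` is made up for illustration. It computes a layout for whatever platform it runs on; it does not inspect a live string object.

```python
import ctypes


class PyStringObjectLayout(ctypes.Structure):
    """Hypothetical mirror of CPython 2.7's PyStringObject (non-debug build)."""
    _fields_ = [
        # Assumed expansion of PyObject_VAR_HEAD:
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
        # Fields listed in the struct itself:
        ("ob_shash", ctypes.c_long),
        ("ob_sstate", ctypes.c_int),
        ("ob_sval", ctypes.c_char * 1),
    ]


def fixed_overhead():
    """Bytes before the character data, plus one for the trailing NUL byte."""
    return PyStringObjectLayout.ob_sval.offset + 1


print(fixed_overhead())
```

On a typical 64-bit Linux or macOS build this comes to 8 + 8 + 8 + 8 + 4 + 1 = 37; platforms with a 4-byte long or 4-byte pointers give a smaller number.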

