C - Saving regex matched strings to an array of strings
So I'm starting to pick up C, and my end goal is to write a function that searches a string using a regular expression and returns an array of matches.
The biggest problem I'm having is saving the strings into memory that can be returned, or referenced through a pointer passed in as a parameter.
I'd like a way to tell how many matches there are, like the C# equivalent: if(matches.count() > 0) { /* have match! */ }
I also want the resulting string of each match group, depending on the pattern I'll pass in.
I know this isn't correct and probably has other errors in practice, but here is the code I walked away with after trying to figure it out by reading up on pointers, structs, char arrays, etc.
typedef struct {
    char *match;
} matches;

int main() {
    regex_t regex;
    int reti;
    char msgbuf[100];
    int max_matches = 10;
    regmatch_t m[max_matches];
    char str[] = "hello world";

    reti = regcomp(&regex, "(hello) (world)", REG_EXTENDED);
    if( reti ) {
        fprintf(stderr, "could not compile regex\n");
        exit(1);
    }

    reti = regexec(&regex, str, (size_t) max_matches, m, 0);
    if( !reti ) {
        puts("match");
    } else if( reti == REG_NOMATCH ) {
        puts("no match");
    } else {
        regerror(reti, &regex, msgbuf, sizeof(msgbuf));
        fprintf(stderr, "regex match failed: %s\n", msgbuf);
        exit(1);
    }

    char *p = str;
    int num_of_matches = 0;
    matches *matches;
    int i = 0;

    for(i = 0; i < max_matches; i++) {
        if (m[i].rm_so == -1) break;

        int start = m[i].rm_so + (p - str);
        int finish = m[i].rm_eo + (p - str);

        if (i == 0)
            printf ("$& ");
        else
            printf ("$%d ", i);

        char match[finish - start + 1];
        memcpy(match, str + start, finish - start);
        match[sizeof(match)] = 0;
        matches[i].match = match; // need to access this string in the array outside of the loop

        printf ("'%.*s' (bytes %d:%d)\n", (finish - start), str + start, start, finish);
        num_of_matches++;
    }
    p += m[0].rm_eo;

    for(i = 0; i < num_of_matches; i++) {
        printf("'%s'\n", matches[i].match);
    }

    /* free the compiled regular expression if you want to use the regex_t again */
    regfree(&regex);
    return 0;
}
Just when I thought I had it working when matching "world", I noticed that when I commented out the printf statements, the last printf statement was returning empty chars or random chars.
Your problems are memory issues with C strings.
First, you define the array of matches:
matches *matches;
This defines a pointer to a match structure, but the pointer is uninitialised and doesn't point anywhere sensible. Instead, you should define an array of matches:
matches matches[max_matches];
This will give you 10 (local) matches that you can access.
Next, you define a local string to hold the match as a variable-length array (VLA):
char match[finish - start + 1];
This time, you have allocated enough space to hold the substring. But the char buffer is local and will be gone when you reach the closing brace of the for loop body. The next pass through the loop might use the same memory. It is illegal to access that memory after the loop.
One solution is to allocate the memory on the heap with malloc:
char *match = malloc(finish - start + 1);
Note that you have to release the resources again later, explicitly, with free.
You copy the substring and end it with a null character. However, when you do so, you don't get the location of the null character right:
match[sizeof(match)] = 0;
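sizeof is an operator that tells you how many bytes the type of the given expression occupies in memory. It is normally evaluated at compile time; for a VLA it is evaluated at run time. When used with the VLA, sizeof(match) is finish - start + 1, so match[sizeof(match)] is one past the end of that buffer. If you use a pointer to allocated memory instead, sizeof is the size of the pointer.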
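The difference can be seen in a minimal sketch. The function names here are made up for illustration; each one returns what sizeof or strlen reports for the same 16-byte buffer holding "hello".

```c
#include <stddef.h>
#include <string.h>

/* sizeof on an array: the size of the whole array in bytes. */
size_t size_via_array(void)
{
    char buf[16] = "hello";
    return sizeof(buf);          /* 16, regardless of the contents */
}

/* sizeof on a pointer: the size of the pointer, not of what it points to. */
size_t size_via_pointer(void)
{
    char buf[16] = "hello";
    char *p = buf;
    return sizeof(p);            /* same as sizeof(char *) */
}

/* sizeof on a VLA: evaluated at run time, here len + 1 bytes. */
size_t size_via_vla(size_t len)
{
    char vla[len + 1];
    return sizeof(vla);
}

/* strlen: counts characters up to, but not including, the '\0'. */
size_t length_via_strlen(void)
{
    char buf[16] = "hello";
    return strlen(buf);          /* 5 */
}
```

In all three sizeof cases the result says nothing about where the string ends, which is why it cannot be used to place the terminator.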
Often sizeof is confused with strlen, but here you can't use strlen either, because match is not yet null-terminated, as strlen requires. You know the size of the string, of course:
match[finish - start] = 0;
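Putting the heap allocation and the correct terminator together, copying one match could be sketched as below. copy_range is a hypothetical helper name, not something from the question or from the C library, and it assumes start and finish are valid offsets into src.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the bytes src[start..finish),
 * terminated with '\0', or NULL if the allocation fails.
 * The caller owns the result and must free() it. */
char *copy_range(const char *src, size_t start, size_t finish)
{
    assert(start <= finish);

    size_t len = finish - start;
    char *copy = malloc(len + 1);       /* +1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;

    memcpy(copy, src + start, len);
    copy[len] = '\0';                   /* index len, not sizeof(copy) */
    return copy;
}
```

In the loop from the question this would be used as matches[i].match = copy_range(str, start, finish); with a matching free(matches[i].match) for each entry once the strings are no longer needed.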
You don't need the pointer p, so just define:
int start = m[i].rm_so;
int finish = m[i].rm_eo;
So:
- Make sure that you allocate memory when you want to store things.
- Take care that local memory isn't invalidated before you access it. (The most egregious example of this is to return the address of a local array from a function. Your case is less offensive, but also less visible.)
- Long-lived memory can be allocated with malloc. Such memory isn't garbage collected, so it must be explicitly freed with free.
- sizeof is (apart from VLAs) a compile-time operator. It is a crutch that is needed for the raw memory functions like malloc. (I've omitted sizeof here, because sizeof(char) is guaranteed to be 1.)
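Applying these points to the original goal, a function that returns the matches as an array of strings might look like the sketch below. regex_matches, free_matches and MAX_GROUPS are names invented for this example, and the limit of 10 entries mirrors max_matches from the question. Like the question's loop, it stops at the first group that did not take part in the match.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

#define MAX_GROUPS 10   /* assumed limit: whole match plus capture groups */

/* Hypothetical helper: frees an array returned by regex_matches. */
void free_matches(char **matches, size_t count)
{
    if (matches == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(matches[i]);
    free(matches);
}

/* Hypothetical helper: matches pattern against str and returns a heap
 * array of heap strings (entry 0 is the whole match, then the groups).
 * *count receives the number of entries. Returns NULL with *count == 0
 * if there is no match or if anything fails. */
char **regex_matches(const char *pattern, const char *str, size_t *count)
{
    assert(pattern != NULL && str != NULL && count != NULL);

    regex_t regex;
    regmatch_t m[MAX_GROUPS];
    char **result = NULL;
    size_t n = 0;

    *count = 0;
    if (regcomp(&regex, pattern, REG_EXTENDED) != 0)
        return NULL;

    if (regexec(&regex, str, MAX_GROUPS, m, 0) == 0)
        result = calloc(MAX_GROUPS, sizeof *result);

    for (size_t i = 0; result != NULL && i < MAX_GROUPS; i++) {
        if (m[i].rm_so == -1)
            break;                          /* no more groups */
        size_t len = (size_t)(m[i].rm_eo - m[i].rm_so);
        char *copy = malloc(len + 1);
        if (copy == NULL) {                 /* undo everything on failure */
            free_matches(result, n);
            result = NULL;
            n = 0;
            break;
        }
        memcpy(copy, str + m[i].rm_so, len);
        copy[len] = '\0';
        result[n++] = copy;
    }

    regfree(&regex);
    *count = n;
    return result;
}
```

The count then plays the role of the C# check from the question: if count is greater than zero there was a match, and the caller releases everything with free_matches when done.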
Isn't working with strings in C fun?