C - Saving regex matched strings to an array of strings
So I'm starting to pick up C, and my end goal is to write a function that searches a string with a regular expression and returns an array of matches.
The biggest problem I'm having is saving the strings into memory that can be returned, or referenced through a pointer passed in as a parameter.
I'd like a way to tell how many matches there are, like the C# equivalent if(matches.count() > 0) { /* have match! */ }, and then get the resulting string of each match group, depending on the pattern I'll pass in.
I know this isn't correct and probably has other errors in practice, but here is the code I walked away with after trying to figure it out by reading up on pointers, structs, char arrays, etc.

    typedef struct {
        char *match;
    } matches;

    int main() {
        regex_t regex;
        int reti;
        char msgbuf[100];
        int max_matches = 10;
        regmatch_t m[max_matches];
        char str[] = "hello world";

        reti = regcomp(&regex, "(hello) (world)", REG_EXTENDED);
        if (reti) {
            fprintf(stderr, "could not compile regex\n");
            exit(1);
        }

        reti = regexec(&regex, str, (size_t) max_matches, m, 0);
        if (!reti) {
            puts("match");
        } else if (reti == REG_NOMATCH) {
            puts("no match");
        } else {
            regerror(reti, &regex, msgbuf, sizeof(msgbuf));
            fprintf(stderr, "regex match failed: %s\n", msgbuf);
            exit(1);
        }

        char *p = str;
        int num_of_matches = 0;
        matches *matches;
        int i = 0;

        for (i = 0; i < max_matches; i++) {
            if (m[i].rm_so == -1) break;

            int start = m[i].rm_so + (p - str);
            int finish = m[i].rm_eo + (p - str);

            if (i == 0)
                printf("$& ");
            else
                printf("$%d ", i);

            char match[finish - start + 1];
            memcpy(match, str + start, finish - start);
            match[sizeof(match)] = 0;
            matches[i].match = match; // need to access this string in the array outside of the loop

            printf("'%.*s' (bytes %d:%d)\n", (finish - start), str + start, start, finish);
            num_of_matches++;
        }
        p += m[0].rm_eo;

        for (i = 0; i < num_of_matches; i++) {
            printf("'%s'\n", matches[i].match);
        }

        /* free the compiled regular expression if you want to use the regex_t again */
        regfree(&regex);
        return 0;
    }

Just when I thought I had got it working when matching "world", I noticed that when I commented out the printf statements, the last printf statement was returning empty chars or random chars.

Your problems are memory issues with C strings.
First, you define your array of matches as:

    matches *matches;

This defines a pointer to a match structure, but that pointer is uninitialised and doesn't point anywhere sensible. Instead, you should define an array of matches:

    matches matches[max_matches];

This will give you 10 (local) matches that you can access.
Next, you define the local string that holds the match as a variable-length array (VLA):

    char match[finish - start + 1];

This time, you have allocated enough space to hold the substring. But this char buffer is local and will be gone when you reach the closing brace of the for loop body. The next pass through the loop might reuse the same memory. It is illegal to access that memory after the loop.
One solution is to allocate the memory on the heap with malloc:

    char *match = malloc(finish - start + 1);

Note that you have to release the resources again later, explicitly, with free.
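You then copy the substring and end it with a null character. However, when you do so, you don't get the location of the null character right:

    match[sizeof(match)] = 0;

sizeof is an operator that tells you how many bytes the type of the given expression occupies in memory. It is normally evaluated at compile time; for a VLA it is evaluated at run time. When used on your VLA, sizeof(match) is the size of the whole buffer, so match[sizeof(match)] is one past the end of that buffer. If you use a pointer to allocated memory instead, sizeof is the size of the pointer, not of the memory it points to.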
Often, sizeof is confused with strlen, but here you can't use strlen either, because match is not yet null-terminated, which strlen requires. But you know the size of your string, of course:

    match[finish - start] = 0;

You also don't need the pointer p; just define:

    int start = m[i].rm_so;
    int finish = m[i].rm_eo;

So:
- Make sure to allocate memory when you want to store things.
- Take care that local memory isn't invalidated before you access it. (The most egregious example of this is returning the address of a local array from a function. Your case is less offensive, but also less visible.)
- Long-lived memory can be allocated with malloc. Such memory isn't garbage collected, so you must explicitly free it with free.
- sizeof is an operator that is normally evaluated at compile time. It is a crutch needed for raw memory functions like malloc. (I've omitted sizeof here, because sizeof(char) is guaranteed to be 1.)
Isn't working with strings in C fun?