php - Best way to create nested array from tables: multiple queries/loops VS single query/loop style -

say have 2 tables, can "merge" , represent in single nested array.

i'm wandering best way that, considering:

efficiency
best-practice
db/server-side usage trade-off
what should in real life
same case 3, 4 or more tables can "merged" way

the question server-side/relational-db.

2 simple ways thinking (if have others, please suggest! notice i'm asking simple server-side , relational-db, please don't waste time explaining why shouldn't use kind of db, use mvc design, etc., etc. ...):

2 loops, 5 simple "select" queries
1 loop, 1 "join" query

i've tried give simple , detailed example, in order explain myself & understand better answers (though how write code and/or finding possible mistakes not issue here, try not focus on that...)

sql scripts creating , inserting data tables

create table persons (     id int not null auto_increment,     fullname varchar(255),     primary key (id) );  insert persons (fullname) values ('alice'), ('bob'), ('carl'), ('dan');  create table phonenumbers (     id int not null auto_increment,     personid int,     phonenumber varchar(255),     primary key (id) );  insert phonenumbers (personid, phonenumber) values ( 1, '123-456'), ( 1, '234-567'), (1, '345-678'), (2, '456-789'), (2, '567-890'), (3, '678-901'), (4, '789-012');

a json representation of tables after "merged" them:

[   {     "id": 1,     "fullname": "alice",     "phonenumbers": [       "123-456",       "234-567",       "345-678"     ]   },   {     "id": 2,     "fullname": "bob",     "phonenumbers": [       "456-789",       "567-890"     ]   },   {     "id": 3,     "fullname": "carl",     "phonenumbers": [       "678-901"     ]   },   {     "id": 4,     "fullname": "dan",     "phonenumbers": [       "789-012"     ]   } ]

pseudo code 2 ways:

query: "select id, fullname persons" personlist = new list<person>() foreach row x in query result:     current = new person(x.fullname)     "select phonenumber phonenumbers personid = x.id"     foreach row y in query result:         current.phonenumbers.push(y.phonenumber)     personlist.push(current)         print personlist

query: "select persons.id, fullname, phonenumber persons             left join phonenumbers on persons.id = phonenumbers.personid" personlist = new list<person>() current = null previouseid = null foreach row x in query result:     if ( x.id !=  previouseid )         if ( current != null )             personlist.push(current)             current = null         current = new person(x.fullname)     current.phonenumbers.push(x.phonenumber) print personlist

code implementation in php/mysql:

/* persons */ $result = mysql_query("select id, fullname persons");  $personsarray = array(); //create array //loop persons while ($row = mysql_fetch_assoc($result)) {     //add new person     $current = array();     $current['id'] = $row['id'];     $current['fullname'] = $row['fullname'];      /* add person phone-numbers */     $id = $current['id'];     $sub_result = mysql_query("select phonenumber phonenumbers personid = {$id}");     $phonenumbers = array();     while ($sub_row = mysql_fetch_assoc($sub_result))     {         $phonenumbers[] = $sub_row['phonenumber']);     }     //add phonenumbers array person     $current['phonenumbers'] = $phonenumbers;      //add person final result array     $personsarray[] = $current; }  echo json_encode($personsarray);

/* persons , phone-numbers in single query */ $sql = "select persons.id, fullname, phonenumber persons             left join phonenumbers on persons.id = phonenumbers.personid"; $result = mysql_query($sql);   $personsarray = array(); /* init temp vars save current person's data */ $current = null; $previouseid = null; $phonenumbers = array(); while ($row = mysql_fetch_assoc($result)) {     /*        if current id different previous id:        you've got new person.        save previous person (if such exists),        , create new 1     */     if ($row['id'] != $previouseid )     {         // in first iteration,         // current (previous person) null,         // don't add         if ( !is_null($current) )         {             $current['phonenumbers'] = $phonenumbers;             $personsarray[] = $current;             $current = null;             $previouseid = null;             $phonenumbers = array();         }          // create new person         $current = array();         $current['id'] = $row['id'];         $current['fullname'] = $row['fullname'];         // set current previous id         $previouseid = $current['id'];     }      // add phone-number      // current phone-number list     $phonenumbers[] = $row['phonenumber'];     } }  // don't forget add last person (saved in "current") if (!is_null($current))     $personsarray[] = $current);  echo json_encode($personsarray);

p.s. link example of different question here, tried suggest second way: tables single json

preliminary

first, thank putting effort explaining problem, , formatting. great see clear doing, , asking.

but must noted that, in itself, forms limitation: fixed on notion correct solution, , small correction or guidance, work. incorrect. must ask give notion up, take big step back, , view (a) whole problem , (b) answer without notion.

the context of answer is:

all explicit considerations have given, important, not repeat
the 2 important of is, best practice , what in real life

this answer rooted in standards, higher order of, or frame of reference for, best practice. have done in real life since 1990, meaning since 1990, have never had need write code such yours. commercial client/server world does, or should doing.

this issue, whole problem space, becoming common problem. give full consideration here, , answer question well. therefore might contain tiny bit more detail require. if does, please forgive this.

consideration

the database server-based resource, shared many users. in online system, database changing. contains 1 version of truth (as distinct 1 fact in 1 place, separate, normalisation issue) of each fact.
- the fact mickey mouse nonsqls not have server architecture, , therefore notion of server in such software false , misleading, separate noted points.
as understand it, json , json-like structures required "performance reasons", precisely because "server" doesn't, cannot, perform server. concept cache data on each (every) client, such not fetching "server" time.
- this opens stinking can of worms. if not design , implement properly, worms overrun app , stench kill you.
- such implementation gross violation of client/server architecture, allows simple code on both sides, , appropriate deployment of software , data components, such implementation times small, , efficiency high.
- further, such implementation requires substantial implementation effort, , complex, consisting of many parts. each of parts must appropriately designed.
- the web, , many books written in subject area, provide cesspool of methods, marketed on basis of supposed simplicity; ease; anyone-can-do-anything; freeware-can-do-anything; etc. there not scientific basis of proposals.

non-architecture & sub-standard

as evidenced, have learned that marketed mythology fraudulent. have encountered one problem, one instance that advice false. solve 1 problem, next problem, not apparent right now, exposed. notions never-ending set of problems.

i not enumerate false notions these pretend-experts (in reality, circus freaks ignorant of technology) market. trust progress through answer, notice 1 after other marketed notion false.

the 2 bottom lines are:

the notions violate architecture , design standards, namely client/server architecture; open architecture; engineering principles; , lesser in particular problem, database design principles.
which leads people you, trying honest job, being defrauded, tricked, seduced, implementing simple notions, turn massive implementations. implementations never quite work, require substantial ongoing maintenance, , replaced, wholesale.

architecture

the central principle being violated is, never duplicate anything. moment have location data duplicated (due caching or replication or 2 separate monolithic apps, etc), create duplicate will go out of synch in online situation. principle avoid doing that.

sure, serious third-party software, such gruntly report tool, design, may cache server-based data in client. note have put hundreds of man years implementing correctly, due consideration above. yours not such piece of software.

rather providing lecture on principles must understood, or evils , costs of each error, rest of answer provides requested what in real life, using correct architectural method (a step above best practice).

architecture 1

do not confuse

the data
must normalised

with

the result set
which, definition, flattened ("de-normalised" not quite correct) view of data.

the data, given normalised, not contain duplicate values; repeating groups. result set will contain duplicate values; repeating groups. pedestrian.

note notion of nested sets (or nested relations), being heavily marketed schizophrenics, based on precisely confusion.
for fortyfive years since advent of rm, have been unable differentiate base relations (for normalisation does apply) derived relations (for normalisation does not apply).
two of freaks mounting assault on definition of first normal form. assault on intellect. (if accepted) normalise insanity. 1nf foundation of other nfs, if insanity accepted, nfs damaged, demeaned, rendered value-less. result normalisation (sparsely defined in mathematical terms, understood science professionals) severely damaged, if not destroyed.

architecture 2

there centuries-old scientific or engineering principle, content (data) must separated control (program elements). because analysis; design; , implementation of 2 different. principle no less important in software sciences, has specific articulation.

in order keep brief (ha ha), instead of discourse, assume understand:

that there scientifically demanded boundary between data , program elements. mixing them results in complex objects error-prone , hard maintain.
- the confusion of principle has reached epidemic proportions in oo/orm world, consequences reach far , wide.
- only educated professionals avoid insanity. rest, great majority, accept insanity "normal", , spend lives fixing problems not have.
the architectural superiority, great value, of data being both stored , presented in tabular form per dr e f codd's relational model. there specific rules normalisation of data.
and importantly, can determine when people in mad house, write , market books, advise non-relational or anti-relational methods.

architecture 3

if cache data on client:

cache absolute minimum.

that means cache data not change in online environment. means reference , lookup tables only, tables populate higher level classifiers, drop-downs, etc.
currency

for every table cache, must have method of (a) determining cached data has become stale, compared 1 version of truth exists on server, , (b) refreshing server, (c) on table-by-table basis.

typically, involves background process executes every (e) 5 minutes, queries max updated datetime each cached table on client vs datetime on server, , if changed, refreshes table, and child tables, dependent on changed table.

that, of course, requires have updateddatetime column on every table. not burden, because need oltp acid transactions anyway (if have real database, instead of bunch of sub-standard files).

which means, never replicate, coding burden prohibitive.

architecture 4

in sub-commercial, non-server world, understand freaks advise reverse (insane people contradict sanity), caching of "everything".

that way programs pusgresql, produced cohorts in same asylum, can used in multi-user system, way can spread cancer.
you pay for: pay peanuts, monkeys; pay zero, zero.

the corollary architecture 3 is, if cache data on client, not cache tables change frequently. these transaction , history tables. notion of caching such tables, or tables, on client bankrupt.

in genuine client/server deployment, due use of applicable standards, each data window, app should query rows required, particular need, @ particular time, based on context or filter values, etc. app should never load entire table.

if same user using same window inspected contents, 15 minutes after first inspection, data 15 mins out of date.

for freeware/shareware/vapourware platforms, define absence of server architecture, , result, performance non-existent, sure, have cache more minimum tables on client.
if that, must take above account, , implement correctly, otherwise app broken, , stench drive users seek termination. if there more 1 user, have same cause, , form army.

architecture 5

now how cache chosen tables on client.

note databases grow, extended.

if system broken, failure, grow in small increments, , require lot of effort.
if system small success, grow exponentially.
if system (each of database, , app, separately) designed , implemented well, changes easy, bugs few.

therefore, components in app must designed properly, comply applicable standards, , database must normalised. in turn minimises effect of changes in database, on app, , vice versa.

the app consist of simple, not complex, objects, easy maintain , change.
for data cache on client, use arrays of form: multiple instances of class in oo platform; datawindows (tm, google it) or similar in 4gl; simple arrays in php.

(aside. note gravely, people in situations such yours produce in 1 year, professional providers such produce in 1 week, using commercial sql platform, commercial 4gl, , complying architecture , standards.)

architecture 6

so let's assume understand above, , appreciate value, particularly architecture 1 & 2.

if don't, please stop here , ask questions, not proceed below.

now have established full context, can address crux of problem.

in arrays in app, why on earth store flattened views of data ?
- and consequently mess with, , agonise over, problems
instead of storing copies of normalised tables ?

answer

never duplicate can derived. architectural principle, not limited normalisation in database.
never merge anything.

if do, creating:
- data duplication, , masses of it, on client. client not fat , slow, anchored floor ballast of duplicated data.
- additional code, unnecessary
- complexity in code
- code fragile, have change.
that precise problem suffering, consequence of method, know intuitively wrong, there must better way. know generic , common problem.

note method (the poison marketed), code, constitutes mental anchor you. @ way have formatted , presented beautifully: of importance you. reluctant inform of this.
- which reluctance overcome, due earnest , forthright attitude, , knowledge did not invent method, followed "teachers" are, evidenced, totally ignorant of relevant science, market insanity; non-science; nonsense, "science".
in each code segment, @ presentation time, , when required:

a. in commercial client/server context
execute query joins simple, normalised, unduplicated tables, , retrieves qualifying rows. thereby obtaining current data values. user never sees stale data. here, views (flattened views of normalised data) used.

b. in sub-commercial non-server context
create temporary result-set array, , join simple, unduplicated, arrays (copies of tables cached), , populate qualifying rows, source arrays. currency of maintained background process.
- use keys form joins between arrays, in same way keys used form joins in relational tables in database.
- destroy components when user closes window.
- a clever version eliminate result-set array, , join source arrays via keys, , limit result qualifying rows.

separate being architectural insanity, nested arrays or nested sets or json or json-like structures not required. consequence of confusing architecture 1 principle.

if choose use such structures, use them only temporary result-set arrays.

last, trust discourse demonstrates n tables non-issue. more important, m levels deep in data hierarchy, "nesting", non-issue.

answer 2

now have given full context (and not before), removes implications in question, , makes generic, kernel one.

the question server-side/relational-db. [which better]:

2 loops, 5 simple "select" queries

1 loop, 1 "join" query

the detailed examples have given not accurately described above. accurate descriptions is:

your option 1 2 loops, each loop loading each array 1 single-table select query per loop (executed n x m times ... outermost loop, only, single execution)
your option 2 1 joined select query executed once followed 2 loops, each loop loading each array

for commercial sql platforms, neither, because not apply.

the commercial sql server set-processing engine. use 1 query whatever joins required, returns result set. never step through rows using loop, reduces set-processing engine pre-1970's isam system. use view, in server, since affords highest performance , code in 1 place.

however, non-commercial, non-server platforms, where:

your "server" not set-processing engine ie. returns single rows, therefore have fetch each row , fill array, manually or
your "server" not provide client/server binding, ie. not provide facilities on client bind incoming result set receiving array, , therefore have step through returned result set, row row, , fill array, manually,

as per example then, answer is, large margin, option 2.

please consider carefully, , comment or ask questions.

response comment

say need print json (or other html page) stout (example: http response to: /allusersphonenumbers. it's example clarify i'm expecting get), should return json. have php function got 2 result sets (1). should print json - how should that?. report employee month salary whole year, , one. 1 way or anther, need gather information , represent in "join"ed representation

perhaps not clear enough.

basically, not use json. unless absolutely have to. means sending system requires it, means receiving system, , demand, very, very, stupid.
make sure system doesn't make such demands on others.
keep data normalised. both in database, , in whatever program elements write. means (in example) use 1 select per table or array. loading purposes, can refer , inspect them @ point in program.
when need join, understand is:
- a result-set; derived relation; view
- therefore temporary, exists duration of execution of element, only
a. tables, join them in usual manner, via keys. 1 query, joining 2 (or more) tables.

b. arrays, join arrays in program, same way join tables in database, via keys.
for example have given, response request, first understand category [4], , fulfil it.

why consider json ???
has json got ???

json misunderstood , people interested in wow factor. solution looking problem. unless have problem has no value. check these 2 links:
copter - json
stackoverflow - json

now if understand that, incoming feeds. never outgoing. further, requires parsing, deconstructing, etc, before can used.

recall:

i need gather information , represent in "join"ed representation

yes. pedestrian. joined not mean jsoned.

in example, receiver expecting flattened view (eg. spreadsheet), cells filled, , yes, users more 1 phonenumber, user details repeated on second nad subsequent result-set row. kind of print, eg. debugging, want flattened view. a:

    select ... person join phonenumber

and return that. or if fulfil request arrays, join person , phonenumber arrays, may require temporary result-set array, , return that.

please don't tell me should 1 user @ time, etc. etc.

correct. if tells regress procedural processing (ie. row row, in while loop), engine or program has set processing (ie. processes entire set in 1 command), marks them idiot. stop listening them completely.

i have stated, option 2 correct, option 1 incorrect. far or select concerned.

on other hand, programming languages not have set-processing capability (ie. cannot print/set/inspect array in single command), or "servers" not provide client-side array binding, have write loops, 1 loop per depth of data hierarchy (in example, 2 loops, 1 person, , 1 phonenumber per user).

you have parse incoming json object.
you have load each array result set returned in option 2.
you have print each array result set returned in option 2.

response comment 2

iv'e ment have return result represented in nested version (lets i'm printing report page), json example such representation.

i don't think understand reasoning , conclusions have provided in answer.

for printing , displaying, never nest. print flattened view, rows returned select per option 2. have been doing, when printing or displaying data relationally, 31 years. easier read, debug, search, find, fold, staple, mutilate. cannot do nested array, except @ it, , gee interesting.

i giving guidance, such can write required code, not working. looking have write code you.

code

caveat

i prefer take code , modify it, actually, looking @ code, not written or structured, cannot reasonably modified. second, if use that, bad teaching tool. have give fresh, clean code, otherwise not learn correct methods.

this code examples follow advice, not going repeat. , way beyond original question.

query & print

your request, using option 2. 1 select executed once. followed 1 loop. can "pretty up" if like.

Search This Blog

Lix