We are developing an API with Lumen.
Today we ran into a confusing problem when fetching the collection of our "TimeLog" model.
We just wanted to get all time logs with additional information from the board model and the task model.
Each time log row has a board_id and a task_id. Both are 1:1 relations.
This was our first code for getting all the data. It took a long time, and sometimes we got a timeout:
BillingController.php
public function byYear() {
$timeLog = TimeLog::get();
$resp = array();
foreach($timeLog->toArray() as $key => $value) {
if ($timeLog[$key]->board_id > 0 && $timeLog[$key]->task_id > 0) {
array_push($resp, array(
'board_title' => isset($timeLog[$key]->board->title) ? $timeLog[$key]->board->title : null,
'task_title' => isset($timeLog[$key]->task->title) ? $timeLog[$key]->task->title : null,
'id' => $timeLog[$key]->id
));
}
}
return response()->json($resp);
}
TimeLog.php, where the relations are defined:
public function board()
{
return $this->belongsTo('App\Board', 'board_id', 'id');
}
public function task()
{
return $this->belongsTo('App\Task', 'task_id', 'id');
}
Our new way is like this:
BillingController.php
public function byYear() {
$timeLog = TimeLog::
join('oc_boards', 'oc_boards.id', '=', 'oc_time_logs.board_id')
->join('oc_tasks', 'oc_tasks.id', '=', 'oc_time_logs.task_id')
->join('oc_users', 'oc_users.id', '=', 'oc_time_logs.user_id')
->select('oc_boards.title AS board_title', 'oc_tasks.title AS task_title','oc_time_logs.id','oc_time_logs.time_used_sec','oc_users.id AS user_id')
->getQuery()
->get();
return response()->json($timeLog);
}
We deleted the relations in TimeLog.php because we don't need them anymore. Now the load time is about 1 second, which is fine!
There are about 20k entries in the time log table.
My questions are:
- Why is the first method so slow (what causes the timeout)?
- What exactly does getQuery() do?
If you need more information just ask me.
--First Question--
One of the issues you might be facing is holding that huge amount of data in memory, i.e.:
$timeLog = TimeLog::get();
This is already enormous. Then, when you convert the collection to an array:
- There is a loop through the collection.
- Calling $timeLog->toArray() while initializing the loop is, as far as I understand, not efficient (I might not be entirely correct about this though).
- Thousands of queries are made to retrieve the related models.
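To make the last point concrete, here is a minimal sketch in plain PHP (no Laravel) that only counts round trips. FakeDb and its methods are hypothetical stand-ins for the database, not part of any framework. Accessing a relation inside the loop costs one extra query per row, while fetching the related rows up front costs one extra query in total. With about 20k time logs and two relations, the per-row pattern could add up to roughly 40,000 extra queries.

```php
<?php
// Hypothetical in-memory stand-in for a database: each method call counts as one query.
final class FakeDb
{
    public $queries = 0;
    private $tables;

    public function __construct(array $tables)
    {
        $this->tables = $tables;
    }

    // Return every row of a table (one query).
    public function all(string $table): array
    {
        $this->queries++;
        return $this->tables[$table];
    }

    // Return the rows whose id is in $ids (one query, like WHERE id IN (...)).
    public function whereIn(string $table, array $ids): array
    {
        $this->queries++;
        return array_intersect_key($this->tables[$table], array_flip($ids));
    }
}

$boards = [1 => ['id' => 1, 'title' => 'Board A'], 2 => ['id' => 2, 'title' => 'Board B']];
$logs = [];
for ($i = 1; $i <= 100; $i++) {
    $logs[$i] = ['id' => $i, 'board_id' => ($i % 2) + 1];
}

// Per-row loading: one query for the logs, then one more for each log's board.
$lazy = new FakeDb(['logs' => $logs, 'boards' => $boards]);
foreach ($lazy->all('logs') as $log) {
    $board = $lazy->whereIn('boards', [$log['board_id']]);
}
$lazyQueries = $lazy->queries;

// Up-front loading: one query for the logs, one for all boards they reference.
$eager = new FakeDb(['logs' => $logs, 'boards' => $boards]);
$rows = $eager->all('logs');
$boardsById = $eager->whereIn('boards', array_unique(array_column($rows, 'board_id')));
$eagerQueries = $eager->queries;

echo "per-row: $lazyQueries queries, up-front: $eagerQueries queries", PHP_EOL;
```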
So what I would propose are five methods (one of which saves you from hundreds of queries), the last of which is efficient at returning a customized result:
Since you have a lot of data, chunk the result (ref: Laravel chunk), so you have this instead:
$timeLog = TimeLog::chunk(1000, function($logs){
foreach ($logs as $log) {
// Do the stuff here
}
});
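Conceptually, chunking means pulling a fixed number of rows per query and processing each batch before asking for the next one, so only one batch is in memory at a time. The sketch below is plain PHP and only an assumption about the general idea, not Laravel's actual implementation; chunkRows and $fetchPage are hypothetical names.

```php
<?php
// Hypothetical helper: fetch $size rows at a time via $fetchPage(offset, limit)
// and pass each batch to $callback. Returns the number of batches processed.
function chunkRows(callable $fetchPage, int $size, callable $callback): int
{
    $pages = 0;
    $offset = 0;
    do {
        $rows = $fetchPage($offset, $size);
        if (count($rows) === 0) {
            break; // nothing left to process
        }
        $callback($rows);
        $pages++;
        $offset += $size;
    } while (count($rows) === $size); // a short batch means we reached the end

    return $pages;
}

// Stand-in "table" of 2500 rows; array_slice plays the role of LIMIT/OFFSET.
$table = range(1, 2500);
$fetchPage = function (int $offset, int $limit) use ($table): array {
    return array_slice($table, $offset, $limit);
};

$seen = 0;
$largestBatch = 0;
$pages = chunkRows($fetchPage, 1000, function (array $rows) use (&$seen, &$largestBatch) {
    $seen += count($rows);
    $largestBatch = max($largestBatch, count($rows));
});

echo "$pages batches, $seen rows, largest batch $largestBatch", PHP_EOL;
```

The trade-off is more queries (one per batch) in exchange for a bounded memory footprint.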
Another way is using a cursor (it runs only one query for the matching conditions). As I understand it, a cursor uses generators internally.
foreach (TimeLog::where([['board_id','>',0],['task_id', '>', 0]])->cursor() as $timelog) {
//do the other stuffs here
}
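As a rough illustration of why a generator helps, the following plain PHP sketch yields one row at a time instead of building the whole array first. rowsLazily is a hypothetical row source with made-up values, not the real table; only the filter condition mirrors the one above.

```php
<?php
// Hypothetical row source: yields rows one by one, so only the current row is held in memory.
function rowsLazily(int $total): Generator
{
    for ($id = 1; $id <= $total; $id++) {
        yield ['id' => $id, 'board_id' => $id % 3, 'task_id' => $id % 5];
    }
}

$matched = 0;
foreach (rowsLazily(20000) as $row) {
    // Same condition as in the query above: both ids must be greater than 0.
    if ($row['board_id'] > 0 && $row['task_id'] > 0) {
        $matched++;
    }
}

echo $matched, PHP_EOL;
```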
This looks like the first approach, but the query is already narrowed down to what you need:
TimeLog::where([['board_id','>',0],['task_id', '>', 0]])->get()
Eager loading would already make the relationships you need available, but might lead to more data in memory too. So the chunk method would possibly make things easier to manage (even if you eager load the related models).
TimeLog::with(['board', 'task'])
->where([['board_id', '>', 0], ['task_id', '>', 0]])
->get();
You can simply use a transformer.
- With a transformer, you can load related models in an elegant, clean and more controlled way even if the size is huge, and a further benefit is that you can transform the result without having to worry about how to loop over it.
You can refer to this answer for a simple use of it. However, in case you don't need to transform your response, you can take one of the other options.
Although this might not entirely solve the problem, the main issue you face is memory management, so the above methods should be useful.
--Second question--
Based on the Laravel API (here), you can see that it simply returns the underlying query builder instance. As far as I can tell, it is not needed in your example.
UPDATE
For question 1, since it seems you simply want to return the result as a response, it is truthfully more efficient to paginate this result. Laravel offers pagination, the easiest form of which is simplePaginate, which works well. The only thing is that it makes a few more queries on the database, but it keeps track of the last index; I guess it uses a cursor as well, but I am not sure. I guess this might finally be the more ideal option:
return TimeLog::paginate(1000);
I have faced a similar problem. The main issue here is that Eloquent is really slow at massive tasks because it fetches all the results at the same time, so the short answer is to fetch the data row by row using PDO fetch.
Short example:
$db = DB::connection()->getPdo();
$query_sql = TimeLog::join('oc_boards', 'oc_boards.id', '=', 'oc_time_logs.board_id')
->join('oc_tasks', 'oc_tasks.id', '=', 'oc_time_logs.task_id')
->join('oc_users', 'oc_users.id', '=', 'oc_time_logs.user_id')
->select('oc_boards.title AS board_title', 'oc_tasks.title AS task_title','oc_time_logs.id','oc_time_logs.time_used_sec','oc_users.id AS user_id')
->toSql();
$query = $db->prepare($query_sql);
$query->execute();
$logs = array();
while ($log = $query->fetch()) {
$log_filled = new TimeLog();
//fill your model and push it into an array to parse it to json in future
array_push($logs,$log_filled);
}
return response()->json($logs);