What is an efficient way of inserting thousands of records into an SQLite table using Django?

Posted 2019-06-26 02:54

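I have been using Django's ORM to insert 8000+ records into an SQLite database. This operation needs to run as a cronjob about once per minute.
At the moment I use a for loop to iterate over all the items and insert them one by one.
Example: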

for item in items:
    entry = Entry(a1=item.a1, a2=item.a2)
    entry.save()

What is an efficient way of doing this?

Edit: a little comparison between the two insertion methods.

Without the commit_manually decorator (11245 records):

[nox@noxdevel marinetraffic]$ time python manage.py insrec

real    1m50.288s
user    0m6.710s
sys     0m23.445s

With the commit_manually decorator (11245 records):

[nox@noxdevel marinetraffic]$ time python manage.py insrec                

real    0m18.464s
user    0m5.433s
sys     0m10.163s

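Note: the test script also does other work besides inserting into the database (it downloads a ZIP file, extracts an XML file from the ZIP archive, and parses the XML file), so the execution time does not necessarily represent the time needed to insert the records.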

Answer 1:

You want to check out django.db.transaction.commit_manually.

http://docs.djangoproject.com/en/dev/topics/db/transactions/#django-db-transaction-commit-manually

So it would be something like:

from django.db import transaction

@transaction.commit_manually
def viewfunc(request):
    ...
    for item in items:
        entry = Entry(a1=item.a1, a2=item.a2)
        entry.save()
    transaction.commit()

This commits only once, instead of at each save().
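To illustrate why committing once helps, here is a minimal sketch that uses Python's standard sqlite3 module directly rather than the Django ORM. The table name `entry` and the helper names are assumptions made for illustration; they are not part of the original question or of Django.

```python
import sqlite3


def insert_committing_each_row(conn, rows):
    """Commit after every INSERT, similar to calling save() in autocommit mode."""
    for a1, a2 in rows:
        conn.execute("INSERT INTO entry (a1, a2) VALUES (?, ?)", (a1, a2))
        conn.commit()  # one commit per row


def insert_in_one_transaction(conn, rows):
    """Run all INSERTs inside one transaction and commit once at the end."""
    with conn:  # commits on success, rolls back if an exception escapes
        for a1, a2 in rows:
            conn.execute("INSERT INTO entry (a1, a2) VALUES (?, ?)", (a1, a2))


def demo(n=1000):
    """Insert n rows with each strategy and return the total row count."""
    conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
    conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, a1 TEXT, a2 TEXT)")
    rows = [("a%d" % i, "b%d" % i) for i in range(n)]
    insert_committing_each_row(conn, rows)
    insert_in_one_transaction(conn, rows)
    count = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
    conn.close()
    return count
```

Both helpers store the same data. On an on-disk database the per-row variant typically pays for a separate commit on every row, which is likely where most of the difference in the timings above comes from.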

Django 1.3 introduced context managers, so now you can use transaction.commit_on_success() in a similar way:

from django.db import transaction

def viewfunc(request):
    ...
    with transaction.commit_on_success():
        for item in items:
            entry = Entry(a1=item.a1, a2=item.a2)
            entry.save()

In Django 1.4, bulk_create was added, allowing you to create a list of your model objects and then commit them all at once.

Note: the save method will not be called when using bulk create.

>>> Entry.objects.bulk_create([
...     Entry(headline="Django 1.0 Released"),
...     Entry(headline="Django 1.1 Announced"),
...     Entry(headline="Breaking: Django is awesome")
... ])

In Django 1.6, transaction.atomic was introduced, intended to replace the now legacy functions commit_on_success and commit_manually.

From the Django documentation on atomic:

atomic is usable both as a decorator:

from django.db import transaction

@transaction.atomic
def viewfunc(request):
    # This code executes inside a transaction.
    do_stuff()

and as a context manager:

from django.db import transaction

def viewfunc(request):
    # This code executes in autocommit mode (Django's default).
    do_stuff()

    with transaction.atomic():
        # This code executes inside a transaction.
        do_more_stuff()


Answer 2:

Bulk create is available in Django 1.4:

https://django.readthedocs.io/en/1.4/ref/models/querysets.html#bulk-create



Answer 3:

Have a look at this. It is meant to work out of the box with MySQL only, but there are pointers on what to do for other databases.



Answer 4:

You might be better off bulk-loading the items: prepare a file and use a bulk-load tool. This will be vastly more efficient than 8000 individual inserts.



Answer 5:

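You should check out DSE. I wrote DSE to solve this kind of problem (massive inserts or updates). Using the Django ORM is a dead end; you have to do it in plain SQL, and DSE takes care of most of that for you.

Thomas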



Answer 6:

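To answer the question specifically with regard to SQLite, as asked: I have just confirmed that bulk_create does provide a tremendous speedup, but there is a limitation with SQLite: "The default is to create all objects in one batch, except for SQLite where the default is such that at maximum 999 variables per query is used."

The quoted text is from the docs; A-IV provided a link.

I would add that this djangosnippets entry by 何昕 also seems to work for me. It is a small wrapper that breaks the large batch you want to process into smaller batches, managing the 999-variable limit.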
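The batching idea described above can be sketched in plain Python. This is not the djangosnippets wrapper itself, only an illustration of the arithmetic; `batch_size_for` and `chunked` are hypothetical helper names, and the limit of 999 is the figure quoted from the docs.

```python
SQLITE_MAX_VARIABLES = 999  # limit quoted from the docs above


def batch_size_for(fields_per_row, max_variables=SQLITE_MAX_VARIABLES):
    """Return the largest number of rows whose bound values fit in one query."""
    if fields_per_row < 1:
        raise ValueError("fields_per_row must be at least 1")
    return max(1, max_variables // fields_per_row)


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Assumed usage with the Entry model from the question (two fields per row):
#
#     size = batch_size_for(2)
#     for batch in chunked(entries, size):
#         Entry.objects.bulk_create(batch)
```

With two fields per row, each batch holds at most 499 objects, so 8000 objects are split into 17 batches. Later Django versions also accept a batch_size argument to bulk_create, which may make a hand-written wrapper unnecessary.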



Answer 7:

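from django.http import HttpResponse
from django.shortcuts import render

# Customer, Orders and Products are assumed to be models imported from this app's models.py.
def order(request):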
    if request.method=="GET":
        # get the value from html page
        cust_name = request.GET.get('cust_name', '')
        cust_cont = request.GET.get('cust_cont', '')
        pincode = request.GET.get('pincode', '')
        city_name = request.GET.get('city_name', '')
        state = request.GET.get('state', '')
        contry = request.GET.get('contry', '')
        gender = request.GET.get('gender', '')
        paid_amt = request.GET.get('paid_amt', '')
        due_amt = request.GET.get('due_amt', '')
        order_date = request.GET.get('order_date', '')
        prod_name = request.GET.getlist('prod_name[]', '')
        prod_qty = request.GET.getlist('prod_qty[]', '')
        prod_price = request.GET.getlist('prod_price[]', '')

        # insert customer information into customer table
        try:
            # Insert Data into customer table
            cust_tab = Customer(customer_name=cust_name, customer_contact=cust_cont, gender=gender, city_name=city_name, pincode=pincode, state_name=state, contry_name=contry)
            cust_tab.save()
            # Retrieve the id of the customer that was just inserted
            custo_id = Customer.objects.values_list('customer_id').last()   # returns a one-element tuple
            custo_id = int(custo_id[0])  # convert the tuple's single value to int
            # Insert Data into Order table
            order_tab = Orders(order_date=order_date, paid_amt=paid_amt, due_amt=due_amt, customer_id=custo_id)
            order_tab.save()
            # Insert data into the Products table
            # Insert multiple rows from Django using a while loop
            i=0
            while(i<len(prod_name)):
                p_n = prod_name[i]
                p_q = prod_qty[i]
                p_p = prod_price[i]

                # Save the product only if none of its fields is empty
                if p_n != "" and p_q != "" and p_p != "":
                    prod_tab = Products(product_name=p_n, product_qty=p_q, product_price=p_p, customer_id=custo_id)
                    prod_tab.save()
                i=i+1

            return HttpResponse('Your Record Has been Saved')
        except Exception as e:
            return HttpResponse(e)     

    return render(request, 'invoice_system/order.html')


Answer 8:

I recommend using plain SQL (not the ORM); you can insert multiple rows with a single insert:

insert into A select * from B;

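The select from B part of your SQL can be as complicated as you want, as long as the results match the columns in table A and there are no constraint conflicts.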
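As a rough sketch of this approach with Python's standard sqlite3 module: the tables `a` and `b`, their columns, and the WHERE clause are assumptions chosen for illustration, not anything from the answer.

```python
import sqlite3


def copy_rows(conn):
    """Copy every row of b with a non-null b1 into a using one INSERT ... SELECT."""
    with conn:  # a single transaction, committed once on success
        conn.execute(
            "INSERT INTO a (a1, a2) "
            "SELECT b1, b2 FROM b WHERE b1 IS NOT NULL"
        )


def demo():
    """Build two small tables, copy the rows across, and return the row count of a."""
    conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
    conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, a1 TEXT, a2 TEXT)")
    conn.execute("CREATE TABLE b (b1 TEXT, b2 TEXT)")
    conn.executemany(
        "INSERT INTO b (b1, b2) VALUES (?, ?)",
        [("x", "1"), ("y", "2"), (None, "3")],
    )
    copy_rows(conn)
    total = conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]
    conn.close()
    return total
```

The selected columns are listed explicitly so that they line up with the target columns of `a`, which is the matching requirement mentioned above.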



Answer 9:

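from django.http import HttpResponse
from django.shortcuts import render

# Customer, Orders and Products are assumed to be models imported from this app's models.py.
def order(request):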
    if request.method=="GET":
        cust_name = request.GET.get('cust_name', '')
        cust_cont = request.GET.get('cust_cont', '')
        pincode = request.GET.get('pincode', '')
        city_name = request.GET.get('city_name', '')
        state = request.GET.get('state', '')
        contry = request.GET.get('contry', '')
        gender = request.GET.get('gender', '')
        paid_amt = request.GET.get('paid_amt', '')
        due_amt = request.GET.get('due_amt', '')
        order_date = request.GET.get('order_date', '')
        print(order_date)
        prod_name = request.GET.getlist('prod_name[]', '')
        prod_qty = request.GET.getlist('prod_qty[]', '')
        prod_price = request.GET.getlist('prod_price[]', '')
        print(prod_name)
        print(prod_qty)
        print(prod_price)
        # insert customer information into customer table
        try:
            # Insert Data into customer table
            cust_tab = Customer(customer_name=cust_name, customer_contact=cust_cont, gender=gender, city_name=city_name, pincode=pincode, state_name=state, contry_name=contry)
            cust_tab.save()
            # Retrieve the id of the customer that was just inserted
            custo_id = Customer.objects.values_list('customer_id').last()   # returns a one-element tuple
            custo_id = int(custo_id[0])  # convert the tuple's single value to int
            # Insert Data into Order table
            order_tab = Orders(order_date=order_date, paid_amt=paid_amt, due_amt=due_amt, customer_id=custo_id)
            order_tab.save()
            # Insert data into the Products table
            # Insert multiple rows from Django using a while loop
            i=0
            while(i<len(prod_name)):
                p_n = prod_name[i]
                p_q = prod_qty[i]
                p_p = prod_price[i]
                # Save the product only if none of its fields is empty
                if p_n != "" and p_q != "" and p_p != "":
                    prod_tab = Products(product_name=p_n, product_qty=p_q, product_price=p_p, customer_id=custo_id)
                    prod_tab.save()
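                i=i+1

            return HttpResponse('Your Record Has been Saved')
        except Exception as e:
            return HttpResponse(e)

    return render(request, 'invoice_system/order.html')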


Source: What is an efficient way of inserting thousands of records into an SQLite table using Django?