Database development mistakes made by application

2018-12-31 12:22发布

What are common database development mistakes made by application developers?

30条回答
梦该遗忘
2楼-- · 2018-12-31 13:03

I hate it when developers use nested select statements or even functions the return the result of a select statement inside the "SELECT" portion of a query.

I'm actually surprised I don't see this anywhere else here, perhaps I overlooked it, although @adam has a similar issue indicated.

Example:

SELECT
    (SELECT TOP 1 SomeValue FROM SomeTable WHERE SomeDate = c.Date ORDER BY SomeValue desc) As FirstVal
    ,(SELECT OtherValue FROM SomeOtherTable WHERE SomeOtherCriteria = c.Criteria) As SecondVal
FROM
    MyTable c

In this scenario, if MyTable returns 10000 rows the result is as if the query just ran 20001 queries, since it had to run the initial query plus query each of the other tables once for each line of result.

Developers can get away with this working in a development environment where they are only returning a few rows of data and the sub tables usually only have a small amount of data, but in a production environment, this kind of query can become exponentially costly as more data is added to the tables.

A better (not necessarily perfect) example would be something like:

SELECT
     s.SomeValue As FirstVal
    ,o.OtherValue As SecondVal
FROM
    MyTable c
    LEFT JOIN (
        SELECT SomeDate, MAX(SomeValue) as SomeValue
        FROM SomeTable 
        GROUP BY SomeDate
     ) s ON c.Date = s.SomeDate
    LEFT JOIN SomeOtherTable o ON c.Criteria = o.SomeOtherCriteria

This allows database optimizers to shuffle the data together, rather than requery on each record from the main table and I usually find when I have to fix code where this problem has been created, I usually end up increasing the speed of queries by 100% or more while simultaneously reducing CPU and memory usage.

查看更多
人气声优
3楼-- · 2018-12-31 13:03
  1. Using an ORM to do bulk updates
  2. Selecting more data than needed. Again, typically done when using an ORM
  3. Firing sqls in a loop.
  4. Not having good test data and noticing performance degradation only on live data.
查看更多
怪性笑人.
4楼-- · 2018-12-31 13:05

For SQL-based databases:

  1. Not taking advantage of CLUSTERED INDEXES or choosing the wrong column(s) to CLUSTER.
  2. Not using a SERIAL (autonumber) datatype as a PRIMARY KEY to join to a FOREIGN KEY (INT) in a parent/child table relationship.
  3. Not UPDATING STATISTICS on a table when many records have been INSERTED or DELETED.
  4. Not reorganizing (i.e. unloading, droping, re-creating, loading and re-indexing) tables when many rows have been inserted or deleted (some engines physically keep deleted rows in a table with a delete flag.)
  5. Not taking advantage of FRAGMENT ON EXPRESSION (if supported) on large tables which have high transaction rates.
  6. Choosing the wrong datatype for a column!
  7. Not choosing a proper column name.
  8. Not adding new columns at the end of the table.
  9. Not creating proper indexes to support frequently used queries.
  10. creating indexes on columns with few possible values and creating unnecessary indexes.
    ...more to be added.
查看更多
伤终究还是伤i
5楼-- · 2018-12-31 13:06

Forgetting to set up relationships between the tables. I remember having to clean this up when I first started working at my current employer.

查看更多
弹指情弦暗扣
6楼-- · 2018-12-31 13:07

Not executing a corresponding SELECT query before running the DELETE query (particularly on production databases)!

查看更多
泪湿衣
7楼-- · 2018-12-31 13:07

Not having an understanding of the databases concurrency model and how this affects development. It's easy to add indexes and tweak queries after the fact. However applications designed without proper consideration for hotspots, resource contention and correct operation (Assuming what you just read is still valid!) can require significant changes within the database and application tier to correct later.

查看更多
登录 后发表回答