How to capitalize a string?

2020-08-01 08:13发布

问题:

I need to capitalize a string:

john doe -> John Doe

How to do this?

I suppose need to use NORMALIZE_AND_CASEFOLD, but it returns a lower case.

https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#normalize_and_casefold

NORMALIZE_AND_CASEFOLD

NORMALIZE_AND_CASEFOLD(value[, normalization_mode]) Description

Takes a STRING, value, and performs the same actions as NORMALIZE, as well as casefolding for case-insensitive operations.

NORMALIZE_AND_CASEFOLD supports four optional normalization modes:

Value Name Description NFC Normalization Form Canonical Composition Decomposes and recomposes characters by canonical equivalence. NFKC Normalization Form Compatibility Composition Decomposes characters by compatibility, then recomposes them by canonical equivalence. NFD Normalization Form Canonical Decomposition Decomposes characters by canonical equivalence, and multiple combining characters are arranged in a specific order. NFKD Normalization Form Compatibility Decomposition Decomposes characters by compatibility, and multiple combining characters are arranged in a specific order.

回答1:

Below is for BigQuery Standard SQL and with use of JS UDF (first query) and SQL UDF (second one)

#standardSQL
CREATE TEMPORARY FUNCTION capitalize(str STRING)
RETURNS STRING
LANGUAGE js AS """
  return str.replace(
      /\\w\\S*/g,
      function(txt) {
          return txt.charAt(0).toUpperCase() + txt.substr(1).toLowerCase();
      }
  );
""";
SELECT str, capitalize(str) capitalized_str
FROM `project.dataste.table`  

or

#standardSQL
CREATE TEMPORARY FUNCTION capitalize(str STRING) AS ((
  SELECT STRING_AGG(CONCAT(UPPER(SUBSTR(word, 1, 1)), LOWER(SUBSTR(word, 2))), ' ' ORDER BY pos)
  FROM UNNEST(SPLIT(str, ' ')) word WITH OFFSET pos
));
SELECT str, capitalize(str) capitalized_str
FROM `project.dataste.table`

you can test above with dummy data as

WITH `project.dataste.table` AS (
  SELECT 'john doe' AS str UNION ALL
  SELECT 'abc xyz'
)

with result for both options:

Row str         capitalized_str  
1   john doe    John Doe     
2   abc xyz     Abc Xyz    

Added least verbose / more streamlined version for JS UDF

#standardSQL
CREATE TEMPORARY FUNCTION capitalize(str STRING)
RETURNS STRING
LANGUAGE js AS """
  return str.replace(/\\b(.)/g, function(match, chr){return chr.toUpperCase();});
""";
SELECT str, capitalize(str) capitalized_str
FROM `project.dataste.table` 


回答2:

Maybe that is a little bit more complex than the easier way...

CONCAT( UPPER( SUBSTR( field,1,1 ) ), SUBSTR( Bets.status_code,2,CHAR_LENGTH(field) ) ) as CapitalizedField