Wrap multiple strings in HTML the React way

Posted 2019-06-20 02:15

I'm building an entity highlighter so I can upload a text file, view the contents on the screen, then highlight words that are in an array. This array is populated by the user when they manually highlight a selection, e.g.:

const entities = ['John Smith', 'Apple', 'some other word'];

This is my text document that is displayed on the screen. It contains a lot of text, and some of this text needs to be visually highlighted to the user once they manually highlight some text, like the name John Smith, Apple and some other word

Now I want to visually highlight all instances of the entity in the text by wrapping it in some markup, and doing something like this works perfectly:

getFormattedText() {
    const paragraphs = this.props.text.split(/\n/);
    const { entities } = this.props;

    return paragraphs.map((p) => {
        let entityWrapped = p;

        // Wrap every occurrence of each entity in <em>.
        // Note: each entity is used as a regex pattern without escaping.
        entities.forEach((text) => {
            const re = new RegExp(text, 'g');
            entityWrapped = entityWrapped.replace(re, `<em>${text}</em>`);
        });

        return `<p>${entityWrapped}</p>`;
    }).join('');
}

...however(!), this just gives me a big string, so I have to dangerously set the inner HTML, and therefore I can't attach an onClick event 'the React way' to any of these highlighted entities, which is something I need to do.

The React way of doing this would be to return an array that looks something like this:

['This is my text document that is displayed on the screen. It contains a lot of text, and some of this text needs to be visually highlighted to the user, like the name', {}, {}, {}]

where each {} is a React object containing the JSX.
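One way to picture that target array is the minimal sketch below, in plain JavaScript. The helper name `splitByEntities` is hypothetical (it is not part of the code above), and the sketch assumes the entity list is non-empty and contains no regex metacharacters. It returns plain strings interleaved with `{ entity }` placeholder objects, which a component could then map to JSX elements.

```javascript
// Hypothetical helper: split text into plain strings and entity placeholders.
// Assumes `entities` is non-empty and contains no regex metacharacters.
function splitByEntities(text, entities) {
  // A capturing group makes String.prototype.split keep the matched separators.
  const re = new RegExp(`(${entities.join('|')})`, 'g');
  return text
    .split(re)
    .filter((part) => part !== '')
    .map((part) => (entities.includes(part) ? { entity: part } : part));
}

const parts = splitByEntities('Ask John Smith about Apple.', ['John Smith', 'Apple']);
console.log(parts);
```

Each placeholder object marks a position where an element with its own onClick handler could be rendered.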

I've had a stab at this with a few nested loops, but it's buggy as hell, difficult to read and as I'm incrementally adding more entities the performance takes a huge hit.

So, my question is... what's the best way to solve this issue? Ensuring code is simple and readable, and that we don't get huge performance issues, as I'm potentially dealing with documents which are very long. Is this the time that I let go of my React morals and dangerouslySetInnerHTML, along with events bound directly to the DOM?

Update

@AndriciCezar's answer below does a perfect job of formatting the array of strings and objects ready for React to render. However, it's not very performant once the array of entities is large (>100) and the body of text is also large (>100 KB). We're looking at about 10x longer to render this as an array versus a string.

Does anyone know a better way to do this that gives the speed of rendering a large string but the flexibility of being able to attach React events on the elements? Or is dangerouslySetInnerHTML the best solution in this scenario?
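One option that sits between the two approaches, sketched here under assumptions rather than measured: keep the markup as a string, but put a `data-entity` attribute on each `<em>` and read it in a single click handler on the container (event delegation). The names `wrapEntities` and `entityFromEvent` are hypothetical, and the sketch assumes a non-empty entity list with no regex or HTML special characters.

```javascript
// Hypothetical helper: wrap each entity in <em data-entity="...">.
// Assumes entities are non-empty and contain no regex or HTML special characters.
function wrapEntities(text, entities) {
  const re = new RegExp(entities.join('|'), 'g');
  return text.replace(re, (match) => `<em data-entity="${match}">${match}</em>`);
}

// Hypothetical delegated handler: one listener on the container inspects
// event.target instead of attaching a listener to every <em>.
function entityFromEvent(event) {
  const target = event.target;
  if (target && target.tagName === 'EM' && target.dataset && target.dataset.entity) {
    return target.dataset.entity;
  }
  return null; // the click landed on plain text
}

const html = wrapEntities('Apple hired John Smith.', ['John Smith', 'Apple']);
console.log(html);
```

In a component, the container could be rendered with dangerouslySetInnerHTML and an onClick that calls `entityFromEvent`. Whether this is fast enough for a given document is something to measure.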

3 answers
仙女界的扛把子
#2 · 2019-06-20 02:29

Here's a solution that uses a regex to split the string on each keyword. You could make this simpler if you don't need it to be case insensitive or highlight keywords that are multiple words.

import React from 'react';

const input = 'This is a test. And this is another test.';
const keywords = ['this', 'another test'];

export default class Highlighter extends React.PureComponent {
    // Recursively split `input` on each regex in turn, wrapping matches in <em>.
    highlight(input, regexes) {
        if (!regexes.length) {
            return input;
        }
        const split = input.split(regexes[0]);
        // Only needed if matches are case insensitive and we need to preserve the
        // case of the original match
        const replacements = input.match(regexes[0]);
        const result = [];
        for (let i = 0; i < split.length - 1; i++) {
            result.push(this.highlight(split[i], regexes.slice(1)));
            // The key avoids React's "unique key" warning for elements in an array.
            result.push(<em key={i}>{replacements[i]}</em>);
        }
        result.push(this.highlight(split[split.length - 1], regexes.slice(1)));
        return result;
    }
    render() {
        const regexes = keywords.map(word => new RegExp(`\\b${word}\\b`, 'ig'));
        return (
            <div>
                { this.highlight(input, regexes) }
            </div>);
    }
}
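The regexes above are built directly from the keywords, so a keyword containing characters such as `.`, `+` or `(` would change the pattern's meaning or throw. Here is a minimal sketch of escaping keywords first; `escapeRegExp` and `keywordToRegex` are hypothetical helper names, not part of the answer's code.

```javascript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a case-insensitive, global, whole-word regex for one keyword.
function keywordToRegex(word) {
  return new RegExp(`\\b${escapeRegExp(word)}\\b`, 'ig');
}

const escaped = escapeRegExp('a.b');
const found = 'Version a.b and axb'.match(keywordToRegex('a.b'));
console.log(escaped, found);
```

Note that `\b` only matches next to a word character, so a keyword that starts or ends with punctuation (for example `C++`) would still need different boundary handling.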
再贱就再见
#3 · 2019-06-20 02:41

Have you tried something like this?

The complexity is number of paragraphs * number of keywords. For a paragraph of 22,273 words (121,104 characters) and 3 keywords, it takes 44ms on my PC to generate the array.

UPDATE: I think this is the clearest and most efficient way to highlight the keywords. I used James Brierley's answer to optimize it.

I tested on 320 KB of data with 500 keywords and it loads pretty slowly. Another idea would be to render the paragraphs progressively: render the first 10 paragraphs, then render the rest on scroll or after some time.

And a JS Fiddle with your example: https://jsfiddle.net/69z2wepo/79047/

const Term = ({ children }) => (
  <em style={{backgroundColor: "red"}} onClick={() => alert(children)}>
    {children}
  </em>
);

const Paragraph = ({ paragraph, keywords }) => {
  let keyCount = 0;
  console.time("Measure paragraph");

  let myregex = keywords.join('\\b|\\b');
  let splits = paragraph.split(new RegExp(`\\b${myregex}\\b`, 'ig'));
  let matches = paragraph.match(new RegExp(`\\b${myregex}\\b`, 'ig'));
  let result = [];

  for (let i = 0; i < splits.length; ++i) {
    result.push(splits[i]);
    if (i < splits.length - 1)
      result.push(<Term key={++keyCount}>{matches[i]}</Term>);
  }

  console.timeEnd("Measure paragraph");

  return (
    <p>{result}</p>
  );
};


const FormattedText = ({ paragraphs, keywords }) => {
    console.time("Measure");

    const result = paragraphs.map((paragraph, index) =>
      <Paragraph key={index} paragraph={paragraph} keywords={keywords} /> );

    console.timeEnd("Measure");
    return (
      <div>
        {result}
      </div>
    );
};

const paragraphs = ["Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla ornare tellus scelerisque nunc feugiat, sed posuere enim congue. Vestibulum efficitur, erat sit amet aliquam lacinia, urna lorem vehicula lectus, sit amet ullamcorper ex metus vitae mi. Sed ullamcorper varius congue. Morbi sollicitudin est magna. Pellentesque sodales interdum convallis. Vivamus urna lectus, porta eget elit in, laoreet feugiat augue. Quisque dignissim sed sapien quis sollicitudin. Curabitur vehicula, ex eu tincidunt condimentum, sapien elit consequat enim, at suscipit massa velit quis nibh. Suspendisse et ipsum in sem fermentum gravida. Nulla facilisi. Vestibulum nisl augue, efficitur sit amet dapibus nec, convallis nec velit. Nunc accumsan odio eu elit pretium, quis consectetur lacus varius"];
const keywords = ["Lorem Ipsum"];

class App extends React.Component {
  constructor(props) {
    super(props);

    this.state = {
      limitParagraphs: 10
    };
  }

  componentDidMount() {
    setTimeout(
      () =>
        this.setState({
          limitParagraphs: 200
        }),
      1000
    );
  }

  render() {
    return (
      <FormattedText paragraphs={paragraphs.slice(0, this.state.limitParagraphs)} keywords={keywords} />
    );
  }
}

ReactDOM.render(
  <App />, 
  document.getElementById("root"));
<script src="https://cdn.jsdelivr.net/lodash/4.17.4/lodash.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/react/15.1.0/react.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/react/15.1.0/react-dom.min.js"></script>

<div id="root">
</div>
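One detail of the joined pattern is worth checking: regex alternation tries alternatives from left to right, so if one keyword is a prefix of another (say `John` and `John Smith`), the shorter one can win. Below is a minimal sketch that sorts longer keywords first. `buildKeywordRegex` is a hypothetical helper, and the sketch assumes a non-empty keyword list with no regex metacharacters.

```javascript
// Hypothetical helper: one regex for all keywords, longest alternatives first.
// Assumes `keywords` is non-empty and free of regex metacharacters.
function buildKeywordRegex(keywords) {
  const sorted = [...keywords].sort((a, b) => b.length - a.length);
  // A non-capturing group lets the two \b anchors apply to every alternative.
  return new RegExp(`\\b(?:${sorted.join('|')})\\b`, 'ig');
}

const text = 'John Smith met John.';
// Same shape as joining ['John', 'John Smith'] with '\\b|\\b'.
const naive = new RegExp('\\bJohn\\b|\\bJohn Smith\\b', 'ig');
const naiveMatches = text.match(naive);
const sortedMatches = text.match(buildKeywordRegex(['John', 'John Smith']));
console.log(naiveMatches, sortedMatches);
```

With the unsorted pattern the full name is never matched as a unit; with the sorted one it is.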

聊天终结者
#4 · 2019-06-20 02:51

The first thing I did was split the paragraph into an array of words.

const words = paragraph.split( ' ' );

Then I mapped the words array to a bunch of <span> tags. This allows me to attach onDoubleClick events to each word.

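// The index is used as the key: a fresh uuid() on every render would make
// React remount every word each time the state changes.
return (
  <div>
    {
      words.map( ( word, index ) => {
        return (
          <span key={ index }
                onDoubleClick={ () => this.highlightSelected() }>
                {
                  this.checkHighlighted( word ) ?
                  <em>{ word } </em>
                  :
                  <span>{ word } </span>
                }
          </span>
        )
      })
    }
  </div>
);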
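Splitting on a single space assumes words are separated by exactly one space. Below is a minimal sketch of a variant that keeps runs of whitespace (including newlines) as their own tokens, so the original spacing can be rendered back. `tokenize` is a hypothetical name, not part of the code above.

```javascript
// Hypothetical helper: split a paragraph into word and whitespace tokens.
// The capturing group makes split() keep the whitespace runs in the result.
function tokenize(paragraph) {
  return paragraph.split(/(\s+)/).filter((token) => token !== '');
}

const source = 'John  Smith\nlikes Apple.';
const tokens = tokenize(source);
console.log(tokens);
```

Only the non-whitespace tokens would need a span with a double-click handler. The whitespace tokens can be rendered as plain strings.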

So if a word is double-clicked, I fire the this.highlightSelected() function, and the word is then conditionally rendered based on whether or not it is highlighted.

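highlightSelected() {
    const selected = window.getSelection();
    // anchorNode is the text node where the selection starts
    // (baseNode is a non-standard alias for it).
    const { data } = selected.anchorNode;

    // `data` is the text of the node that was double-clicked.
    const formattedWord = this.formatWord( data );
    const { entities } = this.state;

    // Build a new array instead of mutating the one held in state.
    const nextEntities = entities.indexOf( formattedWord ) !== -1
      ? entities.filter( ( entity ) => entity !== formattedWord )
      : entities.concat( formattedWord );

    this.setState({ entities: nextEntities });
}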

All I am doing here is either removing the word from, or adding it to, an array in my component's state. checkHighlighted() will just check if the word being rendered exists in that array.

checkHighlighted( word ) {
    const formattedWord = this.formatWord( word );

    return this.state.entities.indexOf( formattedWord ) !== -1;
}
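`indexOf` scans the whole array for every rendered word, which may add up on long documents. Below is a minimal sketch of the same toggle and lookup using a `Set`. `toggleEntity` is a hypothetical helper, not part of the code above, and it returns a new `Set` rather than mutating the one it is given.

```javascript
// Hypothetical helper: return a new Set with `word` added or removed.
// The input Set is left untouched, which suits React state updates.
function toggleEntity(entities, word) {
  const next = new Set(entities);
  if (next.has(word)) {
    next.delete(word);
  } else {
    next.add(word);
  }
  return next;
}

const before = new Set(['apple']);
const added = toggleEntity(before, 'john');
const removed = toggleEntity(added, 'apple');
console.log([...added], [...removed]);
```

The membership check then becomes `entities.has(formattedWord)`.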

And finally, the formatWord() function simply removes any trailing periods or commas and makes everything lower case.

formatWord( word ) {
    return word.replace(/([a-z]+)[.,]/ig, '$1').toLowerCase();
}

Hope this helps!
