Whenever I run the code, it gives me \r\n with spaces. I used the strip function, but it didn't work. How can I resolve this issue? Here is the link: https://ibb.co/VtVV2fb

import scrapy
from ..items import FetchingItem


class SiteFetching(scrapy.Spider):
    name = 'Site'
    start_urls = ['https://www.rev.com/freelancers']
    transcription_page = 'https://www.rev.com/freelancers/transcription'

    def parse(self, response):
        items = {
            'Heading': response.css('#sign-up::text').extract(),
            'Earn_steps': response.css('.pb2 .lh-copy::text , .mb1::text , .mb3 .lh-copy::text').extract(),
        }
        yield response.follow(self.transcription_page, self.trans_faqs, meta={'items': items})

    def trans_faqs(self, response):
        items = response.meta['items']
        names = {
            'name1': 'FAQ1',
            'name2': 'FAQ2',
        }
        finder = {
            'find1': '#whatentailed p::text , #whatentailed .mr3::text',
            'find2': '#requirements p::text , #requirements .mr3::text',
        }
        for name, find in zip(names.values(), finder.values()):
            items[name] = response.css(find.strip()).extract()
        yield items

strip() can remove \r\n only at the ends of a string, not inside it. If you have \r\n inside the text, then use text = text.replace('\r\n', '').
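
To illustrate the difference, here is a small sketch. The sample string is made up for the example and is not the actual output of the page:

```python
# Hypothetical sample; the real text comes from the scraped page.
text = '\r\n  First line\r\n  Second line\r\n'

# strip() only trims whitespace (including \r and \n) from both ends.
stripped = text.strip()
print(repr(stripped))   # 'First line\r\n  Second line'

# replace() removes every occurrence, including the ones inside the string.
replaced = text.replace('\r\n', '')
print(repr(replaced))   # '  First line  Second line'
```

Note that replace() leaves the surrounding spaces in place, so it is usually combined with strip().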

It seems you get \r\n in the list created by extract(), so you have to use a list comprehension to remove it from every element of the list.

EDIT: To remove the spaces and \r\n between sentences, you can split('\r\n') to create a list of sentences, then strip() every sentence, and finally ' '.join() all the sentences back into one string.

The same can be done in one line.
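
A minimal sketch of the split/strip/join approach, first step by step and then in one line, applied with a list comprehension to a list like the one extract() returns. The sample data and the helper names clean and clean_one_line are made up for illustration, and dropping empty pieces is an extra step assumed here so that a leading or trailing \r\n does not leave stray spaces:

```python
# Hypothetical list standing in for what extract() returns.
data = ['\r\n    First sentence.\r\n    Second sentence.\r\n', '  Third.  ']


def clean(text):
    # Split on \r\n, strip each piece, drop empty pieces, join with one space.
    parts = text.split('\r\n')
    parts = [part.strip() for part in parts]
    parts = [part for part in parts if part]
    return ' '.join(parts)


def clean_one_line(text):
    # The same logic written as a single expression.
    return ' '.join(part.strip() for part in text.split('\r\n') if part.strip())


# Apply the cleanup to every element of the list.
cleaned = [clean(item) for item in data]
print(cleaned)   # ['First sentence. Second sentence.', 'Third.']
```

In the spider, the same list comprehension could be applied to the result of response.css(find).extract() before storing it in items[name].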

The same can be done with the re module.
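
One possible way to do this with re is sketched below. The pattern and the sample string are assumptions chosen for illustration; the pattern may need adjusting for the real page text:

```python
import re

# Hypothetical sample; the real text comes from the scraped page.
text = '\r\n    First sentence.\r\n    Second sentence.\r\n'

# Replace each \r\n together with the whitespace around it by a single space,
# then trim what is left at both ends.
cleaned = re.sub(r'\s*\r\n\s*', ' ', text).strip()
print(repr(cleaned))   # 'First sentence. Second sentence.'
```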