Removing all non-Chinese characters in Python: how to remove unwanted characters from a string

① How to delete lines that contain specified Chinese text in Python

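#!/usr/bin/env python
# coding=utf-8

def read_del_list(path):
    # Read one keyword per line; the files are assumed to be UTF-8 encoded.
    del_list = []
    with open(path, encoding='utf-8') as file_handle:
        for row in file_handle:
            # Skip blank lines: an empty keyword would match every row.
            if row.strip():
                del_list.append(row.strip())
    return del_list

def filter_file(from_file, to_file, del_list):
    # Copy every line that contains none of the keywords.
    with open(from_file, encoding='utf-8') as file_handle_from, \
         open(to_file, 'w', encoding='utf-8') as file_handle_to:
        for row in file_handle_from:
            if not any(key_word in row for key_word in del_list):
                file_handle_to.write(row)

if __name__ == '__main__':
    del_list = read_del_list(r"del_list.txt")  # load the filter keywords
    filter_file(r"source.txt", "output.txt", del_list)  # filter the file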

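② How to remove every character except English letters from a Python string

Regular expressions can do this.

For string manipulation of this kind, a regular expression is usually the most convenient tool.

The regular expression for a non-letter character is: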

[^a-zA-Z]
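
A minimal sketch of applying this pattern with re.sub() might look like the following. The helper name keep_letters and the sample string are illustrative assumptions, not part of the original answer.

```python
import re

def keep_letters(text):
    """Delete every character that is not an ASCII letter."""
    # [^a-zA-Z] matches any single character outside a-z and A-Z.
    return re.sub(r'[^a-zA-Z]', '', text)

print(keep_letters('abc, 123 你好 XYZ!'))  # abcXYZ
```

Note that the pattern also deletes spaces, digits and Chinese characters, since none of them fall in the a-z or A-Z ranges.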


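③ How to remove unwanted characters from a string in Python 3

There are several ways to remove unwanted characters:

1. Use Python's replace() method to replace the unwanted characters with an empty string.

2. Use Python's rstrip(), lstrip() and strip() methods to remove unwanted characters from the beginning and end of the string.

For usage, see:

Python 3 replace() method

Python 3 rstrip() method

Python 3 lstrip() method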
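
As a rough illustration of the two approaches listed above, the sketch below applies each method to one sample string. The string and the variable names are made up for the example.

```python
s = '  --hello, world--  '

# replace() removes every occurrence of a substring, wherever it appears.
no_commas = s.replace(',', '')

# strip() with no argument trims whitespace from both ends.
trimmed = s.strip()

# With an argument, strip() trims any of the listed characters from both ends.
core = s.strip(' -')

# lstrip() and rstrip() trim only the left end or only the right end.
left_trimmed = s.lstrip(' -')
right_trimmed = s.rstrip(' -')

print(repr(no_commas))      # '  --hello world--  '
print(repr(trimmed))        # '--hello, world--'
print(repr(core))           # 'hello, world'
print(repr(left_trimmed))   # 'hello, world--  '
print(repr(right_trimmed))  # '  --hello, world'
```

The argument to strip() is a set of characters rather than a prefix or suffix, so ' -' trims any mix of spaces and hyphens.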

④ Using C or Python, remove every symbol other than "," and "." from a file, keeping only the Chinese characters

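# -*- coding: utf-8 -*-
import re
import string

# ASCII punctuation except "," and ".", plus the Chinese title marks
s = re.sub(r"[,.]", "", string.punctuation) + "《》"
# re.escape keeps "\", "]", "^" and "-" literal inside the character class
s = "[%s]" % re.escape(s)

# the input and output files are assumed to be GB2312-encoded, as in the original
with open("out.txt", encoding="GB2312") as file:
    out = re.sub(s, "", file.read())
with open("res.txt", "w", encoding="GB2312") as file:
    file.write(out)

With the punctuation escaped, "\" and "《》" are removed as well; adjust the character list if other symbols need to go.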
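
An alternative is to list what to keep instead of what to delete. The sketch below assumes that "Chinese characters" means the CJK Unified Ideographs range U+4E00 to U+9FFF. The helper name keep_chinese is hypothetical, and rarer characters in the extension blocks are not covered.

```python
import re

# Assumption: Chinese characters are those in U+4E00..U+9FFF.
# Everything else except "," and "." is deleted.
NOT_KEPT = re.compile(r'[^\u4e00-\u9fff,.]')

def keep_chinese(text):
    """Drop everything except Chinese characters, "," and "."."""
    return NOT_KEPT.sub('', text)

print(keep_chinese('《Python》教程, v3.0: 你好!'))  # 教程,.你好
```

This version also deletes spaces and line breaks; adding `\n` inside the brackets would preserve the line structure of a file.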

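⑤ How to remove unwanted characters from a string in Python

Problems:
Strip redundant whitespace from both ends of user input:
' ++++abc123--- '
Remove the '\r' from text edited on Windows:
'hello world \r\n'
Remove Unicode combining characters (tone marks) from text:
"Zhào Qián Sūn Lǐ Zhōu Wú Zhèng Wáng"
How can these problems be solved?
Strip characters from both ends of a string: strip(), rstrip(), lstrip()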

#!/usr/bin/python3
s = ' -----abc123++++ '
# strip whitespace from both ends
print(s.strip())
# strip whitespace from the right end
print(s.rstrip())
# strip whitespace from the left end
print(s.lstrip())
# strip whitespace, '-' and '+' from both ends
print(s.strip().strip('-+'))

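Delete a single character at a fixed position: slicing + concatenation

#!/usr/bin/python3
s = 'abc:123'
# remove the colon by concatenating the slices on either side of it
new_s = s[:3] + s[4:]
print(new_s)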

Delete characters at any position, or several different characters at once: replace(), re.sub()

#!/usr/bin/python3
import re

# remove every occurrence of one character
s = '\tabc\t123\tisk'
print(s.replace('\t', ''))

# remove the \r, \n and \t characters
s = '\r\nabc\t123\nxyz'
print(re.sub('[\r\n\t]', '', s))

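Delete several different characters at once: translate(); in Python 3 the mapping table is built with str.maketrans()

#!/usr/bin/python3
s = 'abc123xyz'
# a -> x, b -> y, c -> z and back: a simple substitution cipher
print(str.maketrans('abcxyz', 'xyzabc'))
# translate() applies the mapping table to the string
print(s.translate(str.maketrans('abcxyz', 'xyzabc')))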
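
The example above substitutes characters without deleting any. To delete characters with translate(), one option is the third argument of str.maketrans(), as in this short sketch (the sample string is an assumption for illustration).

```python
s = 'a-b+c 1-2+3'

# The third argument of str.maketrans() lists the characters to delete.
table = str.maketrans('', '', '-+ ')
cleaned = s.translate(table)
print(cleaned)  # abc123

# A dict that maps code points to None deletes those characters as well.
no_signs = s.translate({ord('-'): None, ord('+'): None})
print(no_signs)  # abc 123
```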

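Remove tone marks from Unicode characters

#!/usr/bin/python3
import sys
import unicodedata

s = "Zhào Qián Sūn Lǐ Zhōu Wú Zhèng Wáng"
remap = {
    # ord() returns the character's Unicode code point
    ord('\t'): '',
    ord('\f'): '',
    ord('\r'): None,
}
# remove \t, \f and \r
a = s.translate(remap)
'''
Use dict.fromkeys() to build a dictionary whose keys are the code points of all
Unicode combining characters and whose values are all None, then use
unicodedata.normalize() to convert the input to its decomposed form.
sys.maxunicode: the largest Unicode code point as an integer, 1114111 (0x10FFFF).
unicodedata.combining(chr): returns the canonical combining class of chr as an
integer, or 0 if no combining class is defined.
'''
cmb_chrs = dict.fromkeys(c for c in range(sys.maxunicode)
                         if unicodedata.combining(chr(c)))  # easier to follow if split into steps
b = unicodedata.normalize('NFD', a)
# call translate() to delete all the combining accents
print(b.translate(cmb_chrs))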
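
Building a table of every combining character scans the whole Unicode range once. For short inputs, a lighter variant is to filter the decomposed string directly. This is a sketch of that alternative, with strip_accents as a hypothetical helper name.

```python
import unicodedata

def strip_accents(text):
    """Remove combining marks such as tone marks from text."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize('NFD', text)
    # combining() returns 0 for base characters and non-zero for the marks.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_accents('Zhào Qián Sūn Lǐ'))  # Zhao Qian Sun Li
```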

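⑥ How do I delete the non-string items from a list in Python? I have a list, for example ["我", "的", 0, "程序"], please

s = ["我", "的", 0, "程序"]
# keep only the items that are strings
s = [value for value in s if isinstance(value, str)]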

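⑦ Python: given a string, remove its non-letter characters, then capitalize the first letter of each string

inp = input()
inp2 = ''
for i in inp:
    # keep letters only
    if i.isalpha():
        inp2 += i
# capitalize the first letter and leave the rest unchanged
print(inp2[:1].upper() + inp2[1:])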
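
If the input holds several words and each one should be capitalized, the question can also be read as "split on non-letters, then capitalize every piece". The sketch below follows that reading. It is an interpretation rather than the original answer, and it assumes a word is a run of ASCII letters.

```python
import re

def capitalize_words(text):
    """Keep runs of ASCII letters and capitalize the first letter of each."""
    # Assumption: a "word" is a maximal run of ASCII letters.
    words = re.findall(r'[A-Za-z]+', text)
    return ' '.join(word[0].upper() + word[1:] for word in words)

print(capitalize_words('hello, wORLD! 123 python3'))  # Hello WORLD Python
```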

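⑧ Delete everything below a specific piece of text in Python and save the file

As a beginner, look up the relevant material, try writing the code yourself, and improve by fixing your mistakes.
Approach:
First create a temporary file (open() is enough).
Then read the txt file line by line (open, readline, with a while loop).
Check whether the line just read is abcdefg: if it is not, write it to the temporary file; if it is, end the loop.
Close both files, then delete the original txt file and rename the temporary file to the txt file's name (import os and use its functions).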
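
The steps above could be sketched roughly as follows. The function name truncate_at_marker, the '.tmp' suffix and the UTF-8 encoding are assumptions made for the example. It iterates over the file instead of calling readline() in a while loop, which reads the same lines.

```python
import os

def truncate_at_marker(path, marker):
    """Keep only the lines of `path` that come before the first line equal to `marker`."""
    tmp_path = path + '.tmp'  # hypothetical name for the temporary file
    with open(path, encoding='utf-8') as src, \
         open(tmp_path, 'w', encoding='utf-8') as dst:
        for line in src:
            if line.strip() == marker:
                break  # the marker line and everything below it are dropped
            dst.write(line)
    # Replace the original file with the temporary one.
    os.replace(tmp_path, path)
```

Calling truncate_at_marker('data.txt', 'abcdefg') would rewrite data.txt in place. As in the steps above, the marker line itself is not kept; write it to the temporary file before the break if it should stay.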

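⑨ How to remove specific characters from a string in Python

See the following (a Python 2 session):

In [20]: aa=u'kasdfjskdf12334342'

In [21]: filter(str.isdigit,str(aa))
Out[21]: '12334342'

In [22]: filter(str.isalpha,str(aa))
Out[22]: 'kasdfjskdf'

Note that because this goes through str(), it raises an error if the string contains non-ASCII characters (such as Chinese characters).
Remove the non-ASCII characters first, then apply the method above.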
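
In Python 3, filter() returns an iterator instead of a string, so the session above needs a join. A minimal equivalent, reusing the same sample value, might be:

```python
aa = 'kasdfjskdf12334342'

# filter() yields the matching characters; join() turns them back into a string.
digits = ''.join(filter(str.isdigit, aa))
letters = ''.join(filter(str.isalpha, aa))

print(digits)   # 12334342
print(letters)  # kasdfjskdf
```

Python 3 strings are Unicode, so non-ASCII input does not raise an error here. However, str.isalpha() also returns True for Chinese characters, so they are kept as letters.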

⑩ How to delete list elements that contain Chinese in Python

First, add this line at the top of the code:
#-*- coding:gbk -*-

Then delete the element with:
list.remove('王五')
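
list.remove() deletes one element whose value is known in advance. To drop every element that contains any Chinese character, a regular expression test is one option. The sketch below assumes Python 3 and treats U+4E00 to U+9FFF as "Chinese"; the sample names are made up.

```python
import re

# Assumption: "Chinese" means characters in the range U+4E00..U+9FFF.
HAS_CHINESE = re.compile(r'[\u4e00-\u9fff]')

names = ['Tom', '王五', 'Li4', 'Anna李']

# list.remove() deletes the first element equal to the given value.
exact = list(names)
exact.remove('王五')

# A comprehension drops every element that contains a Chinese character.
no_chinese = [name for name in names if not HAS_CHINESE.search(name)]

print(exact)       # ['Tom', 'Li4', 'Anna李']
print(no_chinese)  # ['Tom', 'Li4']
```

Python 3 reads source files as UTF-8 by default, so the coding declaration is only needed when the file is saved in another encoding such as GBK.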
